Anthropic Messages Compatibility

Use this page when a client expects Anthropic Messages API wire format.

Which endpoint should I use?

Endpoint | Best for | Tool behavior
--- | --- | ---
/v1/anthropic/messages | Anthropic-compatible clients | Client-managed tools. The API mirrors Anthropic Messages semantics and does not run the Hatz harness.
/v1/anthropic/messages/count_tokens | Anthropic-compatible token preflight | Returns an input token count for limit checks. Billing uses the final model usage event.
/v1/chat/completions | Hatz-native assistant workflows | Hatz harness enabled: recursive tool calling, server-side tools, and Hatz-specific orchestration.

The Anthropic SDK appends /v1/messages to its base URL. Configure SDK clients with:

https://ai.hatz.ai/v1/anthropic

The API also accepts /v1/anthropic/v1/messages and /v1/anthropic/v1/messages/count_tokens for that SDK path.
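To see why the SDK-style paths exist, the sketch below joins the base URL with the suffixes the SDK appends. It only does string handling and sends nothing; the helper name sdk_request_url is made up for illustration.

```python
from urllib.parse import urlsplit

BASE_URL = "https://ai.hatz.ai/v1/anthropic"  # base URL given on this page


def sdk_request_url(base_url: str, sdk_path: str) -> str:
    """Join a base URL and an SDK-appended path without doubling slashes."""
    return base_url.rstrip("/") + "/" + sdk_path.lstrip("/")


# Paths the SDK appends to whatever base URL it is configured with.
messages_url = sdk_request_url(BASE_URL, "/v1/messages")
count_url = sdk_request_url(BASE_URL, "/v1/messages/count_tokens")

print(urlsplit(messages_url).path)  # /v1/anthropic/v1/messages
print(urlsplit(count_url).path)     # /v1/anthropic/v1/messages/count_tokens
```

Both printed paths match the SDK-path routes listed above, so an SDK client needs no path rewriting.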

Model IDs

Use Hatz model IDs from your tenant's model list. For Claude Desktop gateway compatibility, the route also accepts the shorthand aliases sonnet, opus, and haiku and resolves them to Hatz model IDs before execution.

curl 'https://ai.hatz.ai/v1/chat/models' \
  -H "Authorization: Bearer $HATZ_API_KEY"

Examples of valid Hatz IDs include anthropic.claude-haiku-4-5 and moonshot.kimi-k2-thinking when enabled for your tenant.

By default, each shorthand alias resolves to the current enabled Anthropic model in the matching family. Users can enable Custom Anthropic Gateway Model Mapping in Hatz AI Preferences to map each alias to a different enabled Hatz model. Turning the setting off preserves the saved mappings, but the gateway goes back to using the default family models.

Alias | Default behavior
--- | ---
sonnet | Current enabled Claude Sonnet family model unless custom mapping is enabled
opus | Current enabled Claude Opus family model unless custom mapping is enabled
haiku | Current enabled Claude Haiku family model unless custom mapping is enabled
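The resolution rules above can be pictured with a small sketch. This is not the gateway's implementation: GatewayPreferences, resolve_model, and the sonnet and opus default IDs are hypothetical placeholders, and the real defaults depend on which models are enabled for your tenant.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical defaults. Only the haiku ID appears on this page; the other
# two are placeholders standing in for "current enabled family model".
DEFAULT_FAMILY_MODELS: Dict[str, str] = {
    "sonnet": "anthropic.claude-sonnet-example",
    "opus": "anthropic.claude-opus-example",
    "haiku": "anthropic.claude-haiku-4-5",
}


@dataclass
class GatewayPreferences:
    """Hypothetical stand-in for the Hatz AI Preferences setting."""
    custom_mapping_enabled: bool = False
    saved_mappings: Dict[str, str] = field(default_factory=dict)


def resolve_model(requested: str, prefs: GatewayPreferences) -> str:
    """Resolve a shorthand alias; pass any other model ID through unchanged."""
    if requested not in DEFAULT_FAMILY_MODELS:
        return requested
    # Saved mappings apply only while the setting is on.
    if prefs.custom_mapping_enabled and requested in prefs.saved_mappings:
        return prefs.saved_mappings[requested]
    return DEFAULT_FAMILY_MODELS[requested]


prefs = GatewayPreferences(
    custom_mapping_enabled=True,
    saved_mappings={"haiku": "moonshot.kimi-k2-thinking"},
)
print(resolve_model("haiku", prefs))  # moonshot.kimi-k2-thinking

prefs.custom_mapping_enabled = False  # mappings stay saved but are ignored
print(resolve_model("haiku", prefs))  # anthropic.claude-haiku-4-5
```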

Native Claude-style IDs that map to enabled Hatz models, such as claude-sonnet-4-6 and claude-opus-4-7, are also accepted for Claude Desktop compatibility.

Authentication

This endpoint accepts the same authentication methods as all other Hatz API endpoints:

  • Authorization: Bearer <your-api-key>
  • X-API-Key: <your-api-key>
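For clients that build requests by hand, a minimal sketch of producing either header form follows. The function name auth_headers and its style argument are illustrative, not part of any Hatz library.

```python
import os


def auth_headers(api_key: str, style: str = "bearer") -> dict:
    """Build the auth header for one of the two accepted methods."""
    if style == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if style == "x-api-key":
        return {"X-API-Key": api_key}
    raise ValueError(f"unknown auth style: {style!r}")


# Falls back to a placeholder so the sketch runs without a real key.
api_key = os.environ.get("HATZ_API_KEY", "example-key")
headers = {"Content-Type": "application/json", **auth_headers(api_key, "x-api-key")}
print(sorted(headers))  # ['Content-Type', 'X-API-Key']
```

The Anthropic Python SDK sends its api_key value in an x-api-key header. HTTP header names are case-insensitive, so that corresponds to the second form.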

Anthropic SDK Example

import anthropic
import os

client = anthropic.Anthropic(
    base_url="https://ai.hatz.ai/v1/anthropic",
    api_key=os.environ["HATZ_API_KEY"],
)

message = client.messages.create(
    model="anthropic.claude-haiku-4-5",
    max_tokens=128,
    messages=[{"role": "user", "content": "Reply with one short sentence."}],
)

print(message.content[0].text)

curl Example

curl 'https://ai.hatz.ai/v1/anthropic/messages' \
  -H "Authorization: Bearer $HATZ_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "anthropic.claude-haiku-4-5",
    "max_tokens": 128,
    "messages": [
      {
        "role": "user",
        "content": "Write a 2 sentence summary of SOC 2."
      }
    ]
  }'

Streaming

Set stream: true to receive Anthropic-style named SSE events:

message_start
content_block_start
content_block_delta
content_block_stop
message_delta
message_stop

Provider errors are returned as event: error with an Anthropic-compatible error payload.
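A client that does not use the SDK has to split the stream into named events itself. The sketch below parses SSE lines, concatenates text deltas, and raises on an error event. The sample payloads follow the Anthropic Messages event shapes and are illustrative, not captured from Hatz.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, json_payload) pairs; a blank line ends an event."""
    name, data = None, []  # type: (object, List[str])
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                yield (name or "message", json.loads("\n".join(data)))
            name, data = None, []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        elif line.startswith("event:"):
            name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)
    if data:  # flush a final event that lacks a trailing blank line
        yield (name or "message", json.loads("\n".join(data)))


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate text deltas; raise if the stream carries an error event."""
    chunks = []
    for event, payload in iter_sse_events(lines):
        if event == "error":
            err = payload.get("error", {})
            raise RuntimeError(f"{err.get('type')}: {err.get('message')}")
        if event == "content_block_delta":
            delta = payload.get("delta", {})
            if delta.get("type") == "text_delta":
                chunks.append(delta.get("text", ""))
    return "".join(chunks)


SAMPLE_STREAM = [
    "event: message_start",
    'data: {"type": "message_start", "message": {"role": "assistant"}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "Hello"}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": ", world"}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
    "",
]

print(collect_text(SAMPLE_STREAM))  # Hello, world
```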

Tool Calling

Request tools use Anthropic tools and tool_choice shapes. The server maps them into the runtime provider's supported tool format, and the client remains responsible for executing tools and sending tool_result blocks in a follow-up request.
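Because the client executes tools, each tool_use block in a response has to be answered with a tool_result block in the next request. A minimal sketch of that round trip follows, using the Anthropic block shapes. The get_weather tool and the LOCAL_TOOLS registry are hypothetical.

```python
from typing import Callable, Dict, List


def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would call a weather service."""
    return f"Sunny in {city}"


LOCAL_TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def build_tool_results(assistant_content: List[dict]) -> List[dict]:
    """Run each requested tool locally and wrap its output as a tool_result."""
    results = []
    for block in assistant_content:
        if block.get("type") != "tool_use":
            continue
        tool = LOCAL_TOOLS.get(block["name"])
        if tool is None:
            # Report the failure to the model instead of dropping the call.
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": f"Unknown tool: {block['name']}",
                "is_error": True,
            })
            continue
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": str(tool(**block["input"])),
        })
    return results


def follow_up_messages(history: List[dict], assistant_content: List[dict]) -> List[dict]:
    """Append the assistant turn and the matching tool results to the history."""
    return history + [
        {"role": "assistant", "content": assistant_content},
        {"role": "user", "content": build_tool_results(assistant_content)},
    ]


history = [{"role": "user", "content": "What is the weather in Oslo?"}]
assistant_content = [
    {"type": "text", "text": "Checking the weather."},
    {"type": "tool_use", "id": "toolu_example", "name": "get_weather",
     "input": {"city": "Oslo"}},
]
messages = follow_up_messages(history, assistant_content)
print(messages[-1]["content"][0]["content"])  # Sunny in Oslo
```

The returned list would be sent as the messages field of the follow-up request, alongside the same tools definition.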

web_search_20250305 is accepted for Claude Desktop compatibility. It is controlled by the Statsig gate anthropic_messages_web_search_enabled. Setting ANTHROPIC_MESSAGES_WEB_SEARCH_ENABLED=true|false forces the flag on or off as a local override.

When the flag is disabled, the gateway returns a 200 response with Anthropic-style server_tool_use and web_search_tool_result blocks that report search as unavailable in this environment.

When the flag is enabled, the gateway runs Hatz-hosted Firecrawl search, injects the search results into model context, and prefixes the response with Anthropic-style web search result blocks. The provider-facing request does not receive Anthropic's built-in search tool directly.
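In both cases the response carries search blocks alongside ordinary text, so a client may want to separate the two. The sketch below builds a request that includes the web search tool and splits a response's content list by block type. The tool entry uses Anthropic's definition shape, the helper names are illustrative, and the sample response is a stand-in because the exact block contents are not specified here.

```python
from typing import List, Tuple

SEARCH_BLOCK_TYPES = ("server_tool_use", "web_search_tool_result")


def web_search_request(model: str, prompt: str, max_tokens: int = 512) -> dict:
    """Build a Messages request body that includes the web search tool."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "tools": [{"type": "web_search_20250305", "name": "web_search"}],
        "messages": [{"role": "user", "content": prompt}],
    }


def split_search_blocks(content: List[dict]) -> Tuple[List[dict], str]:
    """Separate search-related blocks from the model's text output."""
    search = [b for b in content if b.get("type") in SEARCH_BLOCK_TYPES]
    text = "".join(b.get("text", "") for b in content if b.get("type") == "text")
    return search, text


body = web_search_request("anthropic.claude-haiku-4-5", "Summarize SOC 2 news.")

# Stand-in response content; only the block types are taken from this page.
sample_content = [
    {"type": "server_tool_use", "name": "web_search"},
    {"type": "web_search_tool_result"},
    {"type": "text", "text": "Here is a summary."},
]
search_blocks, answer = split_search_blocks(sample_content)
print(len(search_blocks), answer)  # 2 Here is a summary.
```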

Token Counting

/v1/anthropic/messages/count_tokens is intended for client-side limit checks. Hatz may use a fast local estimate with a safety multiplier, or an exact provider count for multimodal, structured, tool-heavy, or near-limit prompts. Usage accounting is recorded from the final model response, not from this preflight endpoint.
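A limit check built on this endpoint could look like the sketch below. The counter is injected as a callable so the logic can run without network access. The context_window value is a number the caller supplies, and the list of fields forwarded to the counter follows the Anthropic count_tokens request shape, which is assumed to apply here.

```python
from typing import Callable

# Request fields that affect the input token count. max_tokens is an output
# budget and is not part of the count_tokens request.
COUNTABLE_FIELDS = ("model", "messages", "system", "tools", "tool_choice")


def fits_context(
    count_tokens: Callable[[dict], int],
    request: dict,
    context_window: int,
) -> bool:
    """Return True if counted input tokens plus max_tokens fit the window."""
    count_body = {k: v for k, v in request.items() if k in COUNTABLE_FIELDS}
    input_tokens = count_tokens(count_body)
    return input_tokens + request["max_tokens"] <= context_window


def fake_counter(body: dict) -> int:
    """Stand-in for the endpoint: roughly one token per four characters."""
    chars = sum(len(str(m["content"])) for m in body["messages"])
    return max(1, chars // 4)


request = {
    "model": "anthropic.claude-haiku-4-5",
    "max_tokens": 128,
    "messages": [{"role": "user", "content": "x" * 400}],
}
print(fits_context(fake_counter, request, context_window=200))  # False
print(fits_context(fake_counter, request, context_window=228))  # True
```

With the Anthropic SDK client configured as shown earlier, the counter could be `lambda body: client.messages.count_tokens(**body).input_tokens`. Since the endpoint may return an estimate, leaving some headroom below the model's real limit is a reasonable precaution.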

Prompt Caching

Valid cache_control blocks are preserved on message content and tool definitions before the request is handed to the provider runtime. Cache read/write token counts are included in usage metadata when the provider returns them.
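As an illustration, the sketch below adds cache_control markers to a request and reads cache counts from a usage object. The ephemeral marker and the usage field names cache_read_input_tokens and cache_creation_input_tokens come from the Anthropic Messages API. That Hatz reports them under the same names is an assumption, and the helper names are illustrative.

```python
import copy


def mark_cacheable(request: dict) -> dict:
    """Return a copy with cache_control on the last tool and first message."""
    req = copy.deepcopy(request)
    if req.get("tools"):
        req["tools"][-1]["cache_control"] = {"type": "ephemeral"}
    first = req["messages"][0]
    if isinstance(first["content"], str):
        # cache_control attaches to content blocks, so expand plain strings.
        first["content"] = [{"type": "text", "text": first["content"]}]
    first["content"][-1]["cache_control"] = {"type": "ephemeral"}
    return req


def cache_usage(usage: dict) -> dict:
    """Read cache token counts, treating absent or null fields as zero."""
    return {
        "read": usage.get("cache_read_input_tokens") or 0,
        "write": usage.get("cache_creation_input_tokens") or 0,
    }


request = {
    "model": "anthropic.claude-haiku-4-5",
    "max_tokens": 128,
    "messages": [{"role": "user", "content": "Long reference document..."}],
}
cached = mark_cacheable(request)
print(cached["messages"][0]["content"][0]["cache_control"])  # {'type': 'ephemeral'}
print(cache_usage({"input_tokens": 12, "cache_read_input_tokens": 900}))
# {'read': 900, 'write': 0}
```

Defaulting missing fields to zero matches the note above that counts appear only when the provider returns them.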