Anthropic Messages Compatibility
Use this page when a client expects Anthropic Messages API wire format.
Which endpoint should I use?
| Endpoint | Best for | Tool behavior |
|---|---|---|
| /v1/anthropic/messages | Anthropic-compatible clients | Client-managed tools. The API mirrors Anthropic Messages semantics and does not run the Hatz harness. |
| /v1/anthropic/messages/count_tokens | Anthropic-compatible token preflight | Returns an input token count for limit checks. Billing uses the final model usage event. |
| /v1/chat/completions | Hatz-native assistant workflows | Hatz harness enabled: recursive tool calling, server-side tools, and Hatz-specific orchestration. |
The Anthropic SDK appends /v1/messages to its base URL. Configure SDK clients with:
https://ai.hatz.ai/v1/anthropic
The API also accepts /v1/anthropic/v1/messages and /v1/anthropic/v1/messages/count_tokens for that SDK path.
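As a minimal sketch of why the base URL above works, the snippet below joins it with the paths an SDK-style client appends. The helper name `sdk_url` is hypothetical and only illustrates the string join; it is not part of any SDK.

```python
# Base URL from this page; the SDK appends its own /v1/... path to it.
BASE_URL = "https://ai.hatz.ai/v1/anthropic"


def sdk_url(base_url: str, sdk_path: str) -> str:
    """Join a base URL and an SDK path without doubling slashes (hypothetical helper)."""
    return base_url.rstrip("/") + "/" + sdk_path.lstrip("/")


# Paths the SDK is described as appending on this page.
messages_url = sdk_url(BASE_URL, "/v1/messages")
count_tokens_url = sdk_url(BASE_URL, "/v1/messages/count_tokens")

print(messages_url)
print(count_tokens_url)
```

Both resulting paths are the SDK-style routes that the API accepts alongside the shorter /v1/anthropic/messages form.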
Model IDs
Use Hatz model IDs from your tenant's model list. For Claude Desktop gateway compatibility, the route also accepts the shorthand aliases sonnet, opus, and haiku and resolves them to Hatz model IDs before execution.
curl 'https://ai.hatz.ai/v1/chat/models' \
  -H "Authorization: Bearer $HATZ_API_KEY"
Examples of valid Hatz IDs include anthropic.claude-haiku-4-5 and moonshot.kimi-k2-thinking when enabled for your tenant.
By default, each shorthand alias resolves to the currently enabled Anthropic model in the matching family. Users can enable Custom Anthropic Gateway Model Mapping in Hatz AI Preferences to map each alias to another enabled Hatz model. Turning the setting off preserves the saved mappings, but the gateway goes back to using the default family models.
| Alias | Default behavior |
|---|---|
| sonnet | Current enabled Claude Sonnet family model unless custom mapping is enabled |
| opus | Current enabled Claude Opus family model unless custom mapping is enabled |
| haiku | Current enabled Claude Haiku family model unless custom mapping is enabled |
Native Claude-style IDs that map to enabled Hatz models, such as claude-sonnet-4-6 and claude-opus-4-7, are also accepted for Claude Desktop compatibility.
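The alias rules above can be pictured with a small sketch. This is not the gateway's actual code: `resolve_model`, its parameters, and the fallback for aliases missing from a saved mapping are assumptions made for illustration, and native Claude-style IDs are simply passed through here.

```python
from typing import Dict, Optional

ALIASES = ("sonnet", "opus", "haiku")


def resolve_model(
    requested: str,
    default_family_models: Dict[str, str],
    custom_mapping: Optional[Dict[str, str]] = None,
    custom_mapping_enabled: bool = False,
) -> str:
    """Hypothetical resolver: alias -> Hatz model ID, honoring the custom mapping toggle."""
    if requested not in ALIASES:
        # Not a shorthand alias; this sketch passes the ID through unchanged.
        return requested
    if custom_mapping_enabled and custom_mapping and requested in custom_mapping:
        return custom_mapping[requested]
    # Setting off (or alias not mapped): saved mappings are kept but ignored.
    return default_family_models[requested]


# Illustrative values only; real defaults depend on the tenant's enabled models.
defaults = {"haiku": "anthropic.claude-haiku-4-5"}
saved = {"haiku": "moonshot.kimi-k2-thinking"}

print(resolve_model("haiku", defaults))
print(resolve_model("haiku", defaults, saved, custom_mapping_enabled=True))
print(resolve_model("haiku", defaults, saved, custom_mapping_enabled=False))
```

The last call shows the described toggle behavior: the saved mapping still exists, but the default family model is used while the setting is off.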
Authentication
This endpoint accepts the same authentication methods as all Hatz API endpoints:
Authorization: Bearer <your-api-key>
X-API-Key: <your-api-key>
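For clients that build requests by hand, the two header forms could be produced as in the sketch below. The function name `auth_headers` and its `style` argument are hypothetical; only the header names come from this page.

```python
import os


def auth_headers(api_key: str, style: str = "bearer") -> dict:
    """Return one of the two accepted authentication headers (hypothetical helper)."""
    if style == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if style == "x-api-key":
        return {"X-API-Key": api_key}
    raise ValueError(f"unknown auth style: {style!r}")


# Falls back to a placeholder so the sketch runs without a real key.
api_key = os.environ.get("HATZ_API_KEY", "example-key")
bearer = auth_headers(api_key)
x_api_key = auth_headers(api_key, style="x-api-key")
print(sorted(bearer), sorted(x_api_key))
```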
Anthropic SDK Example
import os

import anthropic

# Point the SDK at the Hatz gateway; the SDK appends /v1/messages itself.
client = anthropic.Anthropic(
    base_url="https://ai.hatz.ai/v1/anthropic",
    api_key=os.environ["HATZ_API_KEY"],
)

message = client.messages.create(
    model="anthropic.claude-haiku-4-5",
    max_tokens=128,
    messages=[{"role": "user", "content": "Reply with one short sentence."}],
)

print(message.content[0].text)
curl Example
curl 'https://ai.hatz.ai/v1/anthropic/messages' \
  -H "Authorization: Bearer $HATZ_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "anthropic.claude-haiku-4-5",
    "max_tokens": 128,
    "messages": [
      {
        "role": "user",
        "content": "Write a 2 sentence summary of SOC 2."
      }
    ]
  }'
Streaming
Set stream: true to receive Anthropic-style named SSE events:
message_start
content_block_start
content_block_delta
content_block_stop
message_delta
message_stop
Provider errors are returned as event: error with an Anthropic-compatible error payload.
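A client that does not use the SDK has to parse these events itself. The following is a minimal sketch of an SSE reader that collects text deltas and raises on an error event. The sample payloads are illustrative and assume the stream follows Anthropic's public streaming shapes (`text_delta` deltas, an `error` object with a `message`); `iter_sse_events` and `collect_text` are hypothetical helper names.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, parsed_data) pairs from SSE text lines."""
    event_name = None
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event_name or "message", json.loads("\n".join(data_parts))
            event_name, data_parts = None, []
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
    if data_parts:
        # Flush a final event that was not followed by a blank line.
        yield event_name or "message", json.loads("\n".join(data_parts))


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate text deltas; raise if the stream reports an error event."""
    chunks = []
    for name, data in iter_sse_events(lines):
        if name == "error":
            raise RuntimeError(data.get("error", {}).get("message", "stream error"))
        delta = data.get("delta", {})
        if name == "content_block_delta" and delta.get("type") == "text_delta":
            chunks.append(delta["text"])
    return "".join(chunks)


# Illustrative stream, not captured from the live API.
SAMPLE = """event: message_start
data: {"type": "message_start", "message": {"id": "msg_example", "role": "assistant"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hel"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "lo"}}

event: message_stop
data: {"type": "message_stop"}
"""

text = collect_text(SAMPLE.splitlines())
print(text)
```

In a real client the `lines` argument would come from the HTTP response body, read line by line.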
Tool Calling
Request tools use Anthropic tools and tool_choice shapes. The server maps them into the runtime provider's supported tool format, and the client remains responsible for executing tools and sending tool_result blocks in a follow-up request.
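Because tools are client-managed on this route, the client has to turn each tool_use block into a tool_result block and send a follow-up request. The sketch below shows one way to build that follow-up message list. It assumes Anthropic-style block shapes (`id`, `name`, `input`, `tool_use_id`); the `get_time` tool, its handler, and `build_tool_followup` are hypothetical.

```python
from typing import Callable, Dict, List


def build_tool_followup(
    messages: List[dict],
    assistant_content: List[dict],
    handlers: Dict[str, Callable[..., object]],
) -> List[dict]:
    """Run client-side handlers for tool_use blocks and append tool_result blocks."""
    tool_results = []
    for block in assistant_content:
        if block.get("type") != "tool_use":
            continue
        output = handlers[block["name"]](**block["input"])
        tool_results.append(
            {"type": "tool_result", "tool_use_id": block["id"], "content": str(output)}
        )
    if not tool_results:
        # Nothing to execute; no follow-up turn is needed.
        return list(messages)
    return messages + [
        {"role": "assistant", "content": assistant_content},
        {"role": "user", "content": tool_results},
    ]


def get_time(timezone: str) -> str:
    """Stub tool for illustration; a real handler would look up the time."""
    return f"12:00 {timezone}"


history = [{"role": "user", "content": "What time is it in UTC?"}]
# Illustrative assistant content, shaped like an Anthropic tool_use response.
assistant_content = [
    {"type": "text", "text": "Checking the time."},
    {"type": "tool_use", "id": "toolu_example", "name": "get_time", "input": {"timezone": "UTC"}},
]

followup = build_tool_followup(history, assistant_content, {"get_time": get_time})
print(followup[-1])
```

The returned list would be sent as `messages` in the next request, along with the same `tools` definitions.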
The web_search_20250305 tool type is accepted for Claude Desktop compatibility. It is controlled by the Statsig gate anthropic_messages_web_search_enabled; setting ANTHROPIC_MESSAGES_WEB_SEARCH_ENABLED to true or false overrides the gate locally.
When the flag is disabled, the gateway returns a 200 response with Anthropic-style server_tool_use and web_search_tool_result blocks that report search as unavailable in this environment.
When the flag is enabled, the gateway runs Hatz-hosted Firecrawl search, injects the search results into model context, and prefixes the response with Anthropic-style web search result blocks. The provider-facing request does not receive Anthropic's built-in search tool directly.
Token Counting
/v1/anthropic/messages/count_tokens is intended for client-side limit checks. Hatz may use a fast local estimate with a safety multiplier, or an exact provider count for multimodal, structured, tool-heavy, or near-limit prompts. Usage accounting is recorded from the final model response, not from this preflight endpoint.
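A preflight check might look like the sketch below, which builds the count_tokens request with the standard library and compares the result against a budget. The `input_tokens` response field is an assumption based on Anthropic's count_tokens response shape, the 1024-token limit is a made-up number, and both helper names are hypothetical. The network call is left commented out so the sketch runs offline.

```python
import json
import urllib.request

BASE_URL = "https://ai.hatz.ai/v1/anthropic"


def build_count_tokens_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a POST to the token preflight endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/messages/count_tokens",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fits_budget(input_tokens: int, max_output_tokens: int, context_limit: int) -> bool:
    """True if the prompt plus the requested output fits the (assumed) limit."""
    return input_tokens + max_output_tokens <= context_limit


request = build_count_tokens_request(
    "example-key",
    "anthropic.claude-haiku-4-5",
    [{"role": "user", "content": "Write a 2 sentence summary of SOC 2."}],
)

# To send it (requires a real key and network access):
# with urllib.request.urlopen(request) as resp:
#     input_tokens = json.load(resp)["input_tokens"]  # field name is an assumption

print(request.full_url)
print(fits_budget(input_tokens=800, max_output_tokens=128, context_limit=1024))
```

Since the count may be an estimate with a safety multiplier, treating it as an upper-bound check rather than an exact figure seems the safer reading.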
Prompt Caching
Valid cache_control blocks are preserved on message content and tool definitions before the request is handed to the provider runtime. Cache read/write token counts are included in usage metadata when the provider returns them.
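As an illustration, a request body with a cache breakpoint on a message content block could be built as below. The `{"type": "ephemeral"}` value follows Anthropic's cache_control format; the usage field names `cache_creation_input_tokens` and `cache_read_input_tokens` are assumed to mirror Anthropic's Messages API and may be absent when the provider does not return them. Both helper names are hypothetical.

```python
def cached_prompt_body(context_text: str, question: str, model: str, max_tokens: int = 128) -> dict:
    """Build a request body whose first content block carries cache_control."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [
            {
                "role": "user",
                "content": [
                    # Large, reusable context: marked as a cache breakpoint.
                    {"type": "text", "text": context_text, "cache_control": {"type": "ephemeral"}},
                    # The part that changes between requests stays uncached.
                    {"type": "text", "text": question},
                ],
            }
        ],
    }


def cache_usage(usage: dict) -> dict:
    """Read cache token counts from usage metadata, defaulting to 0 when missing."""
    return {
        "cache_write": usage.get("cache_creation_input_tokens") or 0,
        "cache_read": usage.get("cache_read_input_tokens") or 0,
    }


body = cached_prompt_body("<long policy document>", "Summarize section 2.", "anthropic.claude-haiku-4-5")
print(body["messages"][0]["content"][0]["cache_control"])
print(cache_usage({"input_tokens": 12, "output_tokens": 40}))
```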