Chat Completions

The Chat Completions API mirrors the functionality of the Hatz AI Chat Platform. Its request and response shapes closely follow the OpenAI and Anthropic chat APIs, so in most cases it can serve as a near drop-in replacement.

If you need strict OpenAI Responses compatibility for OpenCode or AI SDK OpenAI-provider flows, use /v1/openai/responses instead. This chat route is Hatz-native and uses the Hatz harness (including recursive tool calling and server-side tools).

List of Available Models

Use the /v1/chat/models endpoint to list the models available to you.

The name field of each model is the model ID to use in your requests.

The display_name field is the human-readable name of the underlying model.

The vision boolean indicates whether the model has image capabilities (recognizing image files).

Example Query

curl 'https://ai.hatz.ai/v1/chat/models' \
  -H "X-API-Key: $HATZ_API_KEY"

Response:

{
    "data": [
        {
            "name": "gpt-3.5-turbo",
            "developer": "OpenAI",
            "display_name": "GPT 3.5 Turbo",
            "max_tokens": 16385,
            "vision": false
        },
        {
            "name": "gpt-4",
            "developer": "OpenAI",
            "display_name": "GPT 4",
            "max_tokens": 8192,
            "vision": false
        },
... rest of data
    ]
}
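As a rough illustration of how the fields above might be consumed on the client side, the following Python sketch indexes a models payload by name and picks out vision-capable models. The `SAMPLE` dictionary simply copies the two entries shown in the response above, and the helper names `models_by_name` and `vision_model_ids` are hypothetical, not part of the API.

```python
# Payload shaped like the /v1/chat/models response shown above
# (only the two entries visible in the example are reproduced).
SAMPLE = {
    "data": [
        {
            "name": "gpt-3.5-turbo",
            "developer": "OpenAI",
            "display_name": "GPT 3.5 Turbo",
            "max_tokens": 16385,
            "vision": False,
        },
        {
            "name": "gpt-4",
            "developer": "OpenAI",
            "display_name": "GPT 4",
            "max_tokens": 8192,
            "vision": False,
        },
    ]
}


def models_by_name(payload):
    """Index model entries by `name`, the id used in the `model` request field."""
    return {model["name"]: model for model in payload.get("data", [])}


def vision_model_ids(payload):
    """Return the `name` of every model whose `vision` flag is true."""
    return [model["name"] for model in payload.get("data", []) if model.get("vision")]
```

Looking a model up by `name` before sending a request makes it easy to check `max_tokens` or `vision` up front.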

Auto Model Routing

Pass "model": "auto" and the system will classify your prompt at request time and route it to the best-fitting underlying model. The classifier picks a (complexity, task_type) pair and the router resolves it against an internal table of model picks per tier.

Modes

Each user has a saved auto-LLM mode that controls how aggressively the router scales up. Modes are managed in the Hatz dashboard under Settings → AI Preferences:

Mode	Behavior	Best for
`lite`	Always uses small, fast, low-cost models. Even complex prompts are clamped to the cheapest tier.	Bulk / high-volume workloads where cost matters more than peak quality.
`performance`	Uses small models for simple prompts and larger models for complex ones. Never uses the top "extra-high" reasoning tier.	Default. Balanced cost / capability for most workloads.
`turbo`	Always uses at least a mid-tier model and unlocks the top "extra-high" tier (e.g. Claude Opus) for hard reasoning / coding prompts.	Production workloads where quality matters more than cost.

Whichever mode is saved on your account is what the API uses when you pass "model": "auto". To change modes, update Default Auto LLM mode in your AI Preferences. There is currently no per-request mode override on the API.

Example

curl 'https://ai.hatz.ai/v1/chat/completions' \
  -H "X-API-Key: $HATZ_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "auto",
    "messages": [
      { "role": "user", "content": "Write a Python function to compute fibonacci numbers efficiently." }
    ]
  }'

The response is identical in shape to a normal completion — the underlying model is chosen for you. If you need strict reproducibility, pin a specific model from /v1/chat/models instead of using "auto".

Notes

Org-level mode whitelists (set by your MSP / tenant admin) do restrict the public API auto-routing path. If your saved mode is no longer allowed by the resolved MSP ∩ tenant ∩ role whitelist, the router silently downshifts to the strictest still-allowed mode (cheapest by enum order). If the whitelist resolves to empty, "auto" requests fail with HTTP 429.
Image-generation prompts under "auto" route to a Gemini image model. To control the image tier directly, pass gemini-3.1-flash-image-preview or gemini-3-pro-image-preview as the model.
If the router can't find any usable model for your account (e.g. credit-band rate limits), the request fails with HTTP 429 — the same error you'd get pinning a rate-limited model directly.
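A minimal Python sketch of the same "auto" request, using only the standard library, might look like the following. It assumes the API key lives in the `HATZ_API_KEY` environment variable; the function names and the `AutoRoutingUnavailable` exception are hypothetical and exist only to surface the HTTP 429 case described in the notes above.

```python
import json
import os
import urllib.error
import urllib.request

API_URL = "https://ai.hatz.ai/v1/chat/completions"


class AutoRoutingUnavailable(Exception):
    """Hypothetical error type: an "auto" request came back with HTTP 429."""


def build_auto_request(messages, api_key):
    """Build (but do not send) a POST request that uses "model": "auto"."""
    body = json.dumps({"model": "auto", "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def complete_auto(messages, api_key=None):
    """Send the request and return the decoded JSON response."""
    api_key = api_key or os.environ["HATZ_API_KEY"]  # assumed env var name
    request = build_auto_request(messages, api_key)
    try:
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 429:
            # Per the notes above: empty whitelist or no usable model.
            raise AutoRoutingUnavailable("auto routing returned HTTP 429") from err
        raise
```

Catching the 429 separately lets a caller decide whether to wait, change the saved mode, or pin a specific model.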

Agents

Agents are pre-configured AI assistants with custom instructions, tools, and knowledge sources. You can list available agents and use them in completions requests.

List Agents

Use the /v1/chat/agents endpoint to get a list of agents available to your user.

curl 'https://ai.hatz.ai/v1/chat/agents' \
  -H "X-API-Key: $HATZ_API_KEY"

Response:

{
    "data": [
        {
            "id": "a1b2c3d4-5678-9abc-def0-1234567890ab",
            "name": "Research Assistant",
            "description": "An agent specialized in web research and summarization",
            "instructions": "You are a research assistant...",
            "tools": ["google_search", "tavily_search"],
            "sources": []
        }
    ]
}

Using an Agent in Completions

To use an agent, set the model field to agent-{agent_id} where agent_id is the id from the agents list:

curl 'https://ai.hatz.ai/v1/chat/completions' \
  -H "X-API-Key: $HATZ_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Research the latest developments in quantum computing"
      }
    ],
    "model": "agent-a1b2c3d4-5678-9abc-def0-1234567890ab",
    "stream": false
  }'

When using an agent, the agent's pre-configured instructions, tools, and sources are automatically applied to the request. You do not need to specify tools_to_use separately — the agent's configured tools will be used.
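The agent-{agent_id} convention can be sketched in Python as follows. The helpers `agent_model_id`, `find_agent_id`, and `agent_request_body` are hypothetical names chosen for illustration; the payload they read is assumed to have the shape of the /v1/chat/agents response shown above.

```python
def agent_model_id(agent_id):
    """Format an agent id as the value expected in the `model` field."""
    return f"agent-{agent_id}"


def find_agent_id(agents_payload, name):
    """Return the id of the first agent with the given name, or None."""
    for agent in agents_payload.get("data", []):
        if agent.get("name") == name:
            return agent["id"]
    return None


def agent_request_body(agents_payload, name, prompt):
    """Build a completions request body that targets the named agent."""
    agent_id = find_agent_id(agents_payload, name)
    if agent_id is None:
        raise LookupError(f"no agent named {name!r}")
    return {
        "messages": [{"role": "user", "content": prompt}],
        "model": agent_model_id(agent_id),
        "stream": False,
        # No tools_to_use here: the agent's configured tools are applied.
    }
```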

Message Structure

Messages are passed in the JSON body of the request.

The following message types are supported:

System Message

The array may contain at most one system message. It carries developer-provided instructions that the model should follow, regardless of messages sent by the user.

User Message

Messages sent by an end user, containing prompts or additional context information.

Assistant Message

Messages sent by the model in response to user messages.

Example Message Array

"messages": [
  {
    "role": "system",
    "content": "Send responses with humor"
  },
  {
    "role": "user",
    "content": "Can you describe the difference between different text file types"
  }
],
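A client could check these rules before sending a request. The Python sketch below is one possible pre-flight check, assuming the three roles described above are the only ones accepted and that every message carries a `content` key; the `validate_messages` helper is hypothetical, and the server remains the final authority on what is valid.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    system_count = 0
    for index, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"message {index}: unknown role {role!r}")
        if role == "system":
            system_count += 1
        if "content" not in message:
            problems.append(f"message {index}: missing content")
    if system_count > 1:
        problems.append("more than one system message")
    return problems
```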

Request Parameters

Parameter	Type	Required	Default	Description
`messages`	array	Yes	—	The list of messages to process (see Message Structure above)
`model`	string	No	`gpt-4o`	The AI model to use. Use the `name` from `/v1/chat/models`, `agent-{id}` for agents, or `"auto"` to let the system pick a model per request (see Auto Model Routing above)
`stream`	boolean	No	`false`	Whether to stream the response via Server-Sent Events
`file_uuids`	array of UUIDs	No	`[]`	File UUIDs from the file upload endpoint to include as context
`tools_to_use`	array of strings	No	`[]`	List of tool names to enable (see Available Tools below)
`auto_tool_selection`	boolean	No	`false`	Let the system automatically select relevant tools based on the message
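The parameter table can be mirrored by a small Python builder that applies the documented defaults. This is only a sketch: `build_payload` is a hypothetical helper, and the choice to omit optional fields when they are empty (rather than sending `[]` or `false`) is an assumption that relies on the server applying the defaults listed above.

```python
import uuid


def build_payload(
    messages,
    model="gpt-4o",
    stream=False,
    file_uuids=None,
    tools_to_use=None,
    auto_tool_selection=False,
):
    """Assemble a request body using the defaults from the parameter table."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    payload = {"messages": list(messages), "model": model, "stream": stream}
    if file_uuids:
        # uuid.UUID raises ValueError on malformed ids before the request is sent.
        payload["file_uuids"] = [str(uuid.UUID(str(item))) for item in file_uuids]
    if tools_to_use:
        payload["tools_to_use"] = list(tools_to_use)
    if auto_tool_selection:
        payload["auto_tool_selection"] = True
    return payload
```

Passing the result through `json.dumps` yields a body equivalent to the curl examples in this section.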

Available Tools

You can enable tools by passing their names in the tools_to_use array.

Web Search

Tool Name	Description
`google_search`	Google Custom Search API
`firecrawl_search`	Web search via Firecrawl
`firecrawl_scrape`	Web page scraping via Firecrawl
`firecrawl_extract`	AI data extraction from web pages via Firecrawl
`tavily_search`	Web search via Tavily
`tavily_qna`	Web search with AI-generated answers via Tavily
`exa_search`	Neural web search via Exa
`exa_answer`	Neural web search with AI answers via Exa
`perplexity_search`	Web search via Perplexity
`perplexity_ask`	Web search and AI answers via Perplexity

Code Execution

Tool Name	Description
`daytona_code_execution`	Enables all code execution tools below
`run_python_script`	Execute Python code in a sandboxed environment
`execute_shell_command`	Execute shell commands in a sandboxed environment
`list_files`	List files in the sandbox
`sandbox_download_file`	Download a file from the sandbox
`upload_file_from_url`	Upload a file to the sandbox from a URL

Location & Weather

Tool Name	Description
`google_maps_text_search`	Search for places by text query
`google_maps_nearby_search`	Search for places near a location
`google_maps_directions`	Get directions between locations
`google_maps_route`	Get route information between locations
`google_weather_current`	Get current weather conditions
`google_weather_forecast`	Get weather forecast
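To catch misspelled tool names before a request is sent, a client might keep a local copy of the tables above. The Python sketch below does that; the group keys (`web_search`, `code_execution`, `location_weather`) and the `unknown_tools` helper are hypothetical labels for illustration, and the list is only as current as this page.

```python
# Tool names copied from the tables above; the group keys are illustrative only.
DOCUMENTED_TOOLS = {
    "web_search": [
        "google_search", "firecrawl_search", "firecrawl_scrape",
        "firecrawl_extract", "tavily_search", "tavily_qna",
        "exa_search", "exa_answer", "perplexity_search", "perplexity_ask",
    ],
    "code_execution": [
        "daytona_code_execution", "run_python_script", "execute_shell_command",
        "list_files", "sandbox_download_file", "upload_file_from_url",
    ],
    "location_weather": [
        "google_maps_text_search", "google_maps_nearby_search",
        "google_maps_directions", "google_maps_route",
        "google_weather_current", "google_weather_forecast",
    ],
}


def unknown_tools(requested):
    """Return the requested names that do not appear in the documented tables."""
    known = {name for names in DOCUMENTED_TOOLS.values() for name in names}
    return [name for name in requested if name not in known]
```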

Example: Using Tools

curl 'https://ai.hatz.ai/v1/chat/completions' \
  -H "X-API-Key: $HATZ_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Search the web for the latest news about AI regulation"
      }
    ],
    "model": "gpt-4o",
    "stream": false,
    "tools_to_use": ["google_search"]
  }'

Example: Auto Tool Selection

Instead of specifying tools manually, you can let the system choose:

curl 'https://ai.hatz.ai/v1/chat/completions' \
  -H "X-API-Key: $HATZ_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Write a Python script to analyze this CSV data"
      }
    ],
    "model": "gpt-4o",
    "stream": false,
    "auto_tool_selection": true,
    "file_uuids": ["8fb5bc1d-5a8d-4015-86a6-b0ca394e7793"]
  }'

Example: Code Execution with Files

curl 'https://ai.hatz.ai/v1/chat/completions' \
  -H "X-API-Key: $HATZ_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Read the uploaded CSV and generate a summary chart"
      }
    ],
    "model": "gpt-4o",
    "stream": false,
    "tools_to_use": ["daytona_code_execution"],
    "file_uuids": ["8fb5bc1d-5a8d-4015-86a6-b0ca394e7793"]
  }'

Basic Completions Example

Example Query

curl 'https://ai.hatz.ai/v1/chat/completions' \
  -H "X-API-Key: $HATZ_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {
        "role": "system",
        "content": "Send responses with humor"
      },
      {
        "role": "user",
        "content": "Can you describe the difference between different text file types"
      }
    ],
    "model": "gpt-4o",
    "stream": false
  }'

Response:

{
  "choices": [
    {
      "message": {
        "content": "Sure thing! Think of text file types as different personalities at a party. \n\n1. **.txt** - This is the plain Jane of text files. No frills, no fuss, just straight-up text. If it were a person, it would be the one wearing a \"Hello, my name is...\" sticker.\n\n2. **.doc/.docx** - The Microsoft Word files love to dress up. They've got fonts, colors, and sometimes even clip art. They're like the person who shows up in a tuxedo when everyone else is in jeans.\n\n3. **.pdf** - The PDF is the bouncer of the text file world. Once something's in PDF form, it's locked down and hard to edit. It's like the friend who says, \"We're not changing the plan!\"\n\n4. **.rtf** - Rich Text Format files try to please everyone. They have some basic formatting, like bold or italics, but they don't go overboard. Imagine someone who brings a cheese platter to a potluck—not too fancy, but not boring either.\n\n5. **.md** - Markdown files are the hipsters of text files. They're minimalistic and love to hang around in tech circles, pretending they don't care about appearances, but they still look surprisingly good on the web.\n\n6. **.html** - HTML files are the social butterflies. They're designed to interact with everyone on the web. They're the ones at the party gossiping about the latest viral meme and can show you pictures, videos, and more.\n\n7. **.csv** - Comma-separated values files are the nerds with a heart of gold. They might look like a jumbled mess of data at first, but they're great at organizing info into spreadsheets. Think pocket protectors and glasses, but they're the ones you'll hire to do your taxes.\n\nEach type has its quirks, but they all have their place in the digital world!",
        "role": "assistant"
      }
    }
  ],
  "model": "gpt-4o",
  "usage": {
    "input_tokens": 26,
    "output_tokens": 398
  }
}
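Reading the response in Python could look like the sketch below. It assumes a non-streaming response with the same shape as the example above (`choices[0].message.content` plus a `usage` object); the helper names `first_reply` and `total_tokens` are hypothetical.

```python
def first_reply(response):
    """Return the assistant text from the first choice."""
    return response["choices"][0]["message"]["content"]


def total_tokens(response):
    """Sum input and output tokens, treating a missing count as zero."""
    usage = response.get("usage", {})
    return usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
```

For the example response above, `total_tokens` would add the 26 input tokens and 398 output tokens reported under `usage`.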
