# Auto Routing

> Dynamically route requests to the best model.

With Mesh API's Auto Router, you set `model: "auto"` on a request to any supported inference endpoint. The gateway classifies the request with an internal LLM, selects the most appropriate model from the live registry, and forwards the request transparently.

No client-side logic is required beyond setting the `model` field to `"auto"`.

## Supported endpoints

Auto Routing is supported across the following inference endpoints:

| Endpoint                    | Streaming Supported |
| --------------------------- | ------------------- |
| `POST /v1/chat/completions` | Yes                 |
| `POST /v1/responses`        | Yes                 |
| `POST /v1/embeddings`       | No                  |

## Basic request

Replace the specific model ID in your request with `"auto"`:

```bash
curl https://api.meshapi.ai/v1/chat/completions \
  -H "Authorization: Bearer <YOUR_RSK_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Write a Python quicksort implementation"}]
  }'
```

```ts
// Assumes `client` is an already initialized Mesh API client.
const response = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Write a Python quicksort implementation" }],
});
```

```python
from meshapi import ChatMessage, ChatCompletionParams, MeshAPI

client = MeshAPI(base_url="https://api.meshapi.ai", token="rsk_...")

response = client.chat.completions.create(
    ChatCompletionParams(
        model="auto",
        messages=[ChatMessage(role="user", content="Write a Python quicksort implementation")],
    )
)
```

```go
params := meshapi.ChatCompletionParams{
    Model: meshapi.String("auto"),
    Messages: []meshapi.ChatMessage{
        {Role: "user", Content: meshapi.String("Write a Python quicksort implementation")},
    },
}

// Assumes `client` and `ctx` are already initialized.
_, err := client.Chat.Completions.Create(ctx, params)
if err != nil {
    // Handle the error.
}
```

```java
ChatCompletionRequest request = ChatCompletionRequest.builder()
    .model("auto")
    .addMessage(ChatMessage.user("Write a Python quicksort implementation"))
    .build();

// Assumes `client` is an already initialized Mesh API client.
client.chat().completions().create(request);
```

## Response metadata

When a request is automatically routed, Mesh API injects metadata into the response so you know which model was actually used.

### Non-streaming requests

The metadata is included directly in the response body.

```json
{
  "id": "chatcmpl-...",
  "model": "openai/gpt-4o",
  "choices": [...],
  "x_auto_routed": true,
  "x_resolved_model_id": "openai/gpt-4o"
}
```

If the internal classifier fails or times out and a fallback model is used, additional fields are present:

```json
{
  "x_auto_routed": true,
  "x_resolved_model_id": "openai/gpt-4o-mini",
  "x_auto_routed_fallback": true,
  "x_auto_routed_fallback_reason": "classifier_timeout"
}
```
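The fields above can be read from the parsed response body with a small helper. The following is a minimal sketch, not part of any Mesh API SDK: the `RoutingInfo` class and the `routing_info_from_body` function are hypothetical names, and the sketch assumes the body has already been decoded into a dictionary. Only the `x_*` field names come from the examples above.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class RoutingInfo:
    """Hypothetical container for the auto-routing metadata fields."""
    auto_routed: bool
    resolved_model: Optional[str]
    fallback: bool
    fallback_reason: Optional[str]


def routing_info_from_body(body: Mapping[str, Any]) -> RoutingInfo:
    """Extract the x_auto_* fields from a decoded JSON response body.

    Missing fields are treated as "not auto-routed" / "no fallback".
    """
    return RoutingInfo(
        auto_routed=bool(body.get("x_auto_routed", False)),
        resolved_model=body.get("x_resolved_model_id"),
        fallback=bool(body.get("x_auto_routed_fallback", False)),
        fallback_reason=body.get("x_auto_routed_fallback_reason"),
    )


# Usage with the fallback example shown above.
info = routing_info_from_body({
    "x_auto_routed": True,
    "x_resolved_model_id": "openai/gpt-4o-mini",
    "x_auto_routed_fallback": True,
    "x_auto_routed_fallback_reason": "classifier_timeout",
})
if info.fallback:
    print(f"Fallback to {info.resolved_model}: {info.fallback_reason}")
```

Treating absent fields as `False` or `None` lets the same helper handle responses to requests that named a specific model.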

### Streaming requests

For streaming requests (`stream: true`), the metadata is included as HTTP response headers before the SSE stream begins:

```text
X-Auto-Routed: true
X-Resolved-Model-Id: openai/gpt-4o
```
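Because HTTP header names are case-insensitive, a client may see these headers in a different casing than shown. The sketch below reads them from a plain mapping of header names to values; the function name `routing_info_from_headers` is hypothetical, and how you obtain the headers depends on your HTTP client.

```python
from typing import Mapping, Optional, Tuple


def routing_info_from_headers(headers: Mapping[str, str]) -> Tuple[bool, Optional[str]]:
    """Return (auto_routed, resolved_model_id) from streaming response headers."""
    # Normalize names so "X-Auto-Routed" and "x-auto-routed" are both matched.
    lowered = {name.lower(): value for name, value in headers.items()}
    auto_routed = lowered.get("x-auto-routed", "").strip().lower() == "true"
    resolved_model = lowered.get("x-resolved-model-id")
    return auto_routed, resolved_model


# Usage with the headers shown above.
auto_routed, model_id = routing_info_from_headers({
    "X-Auto-Routed": "true",
    "X-Resolved-Model-Id": "openai/gpt-4o",
})
```

Read the headers before consuming the SSE stream, since they arrive with the initial response.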

## Fallback behavior

The Auto Router is designed never to block a request because of its own failure. If the internal classification model does not respond in time or returns an unknown model, the gateway automatically falls back to a reliable default model (for example, `openai/gpt-4o-mini`) and flags the fallback in the response metadata.
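If you want to monitor how often fallbacks occur, one option is to tally the fallback reason across the response bodies you receive. This is a minimal sketch under stated assumptions: `count_fallback_reasons` is a hypothetical helper, the bodies are already decoded dictionaries, and `"unknown"` is a placeholder label chosen here for responses that flag a fallback without a reason.

```python
from collections import Counter
from typing import Any, Iterable, Mapping


def count_fallback_reasons(bodies: Iterable[Mapping[str, Any]]) -> Counter:
    """Count fallback reasons across decoded non-streaming response bodies."""
    counts: Counter = Counter()
    for body in bodies:
        # Only responses flagged as fallbacks contribute to the tally.
        if body.get("x_auto_routed_fallback"):
            reason = body.get("x_auto_routed_fallback_reason") or "unknown"
            counts[reason] += 1
    return counts


# Usage: two normal responses and one classifier timeout.
responses = [
    {"x_auto_routed": True, "x_resolved_model_id": "openai/gpt-4o"},
    {"x_auto_routed": True, "x_resolved_model_id": "openai/gpt-4o"},
    {
        "x_auto_routed": True,
        "x_resolved_model_id": "openai/gpt-4o-mini",
        "x_auto_routed_fallback": True,
        "x_auto_routed_fallback_reason": "classifier_timeout",
    },
]
reasons = count_fallback_reasons(responses)
```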

## Billing

When you use the Auto Router, you are billed for the tokens consumed by the *resolved* model that served the request **and for the tokens consumed by the internal classifier model**. Both appear in your usage dashboard.
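The two-part charge can be illustrated with simple arithmetic. The sketch below is illustrative only: the function name, the single per-million-token rate per model, and all prices and token counts are assumptions made for this example, not Mesh API pricing. Actual charges are the ones reported in your usage dashboard.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_auto_routed_cost(
    resolved_tokens: int,
    resolved_price_per_million: Decimal,
    classifier_tokens: int,
    classifier_price_per_million: Decimal,
) -> Decimal:
    """Sum the resolved-model cost and the classifier cost for one request."""
    resolved_cost = resolved_price_per_million * resolved_tokens / TOKENS_PER_MILLION
    classifier_cost = classifier_price_per_million * classifier_tokens / TOKENS_PER_MILLION
    return resolved_cost + classifier_cost


# Hypothetical numbers: 1,200 tokens on the resolved model at 5.00 per million,
# plus 300 classifier tokens at 0.50 per million.
total = estimate_auto_routed_cost(1200, Decimal("5.00"), 300, Decimal("0.50"))
```

`Decimal` is used instead of floating point so that small per-request amounts add up without rounding drift.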