Auto Routing | Mesh API Docs

With Mesh API’s Auto Router, you set model: "auto" on an inference request. The gateway classifies the request with an internal LLM, selects the most appropriate model from the live registry, and forwards the request transparently.

This requires no client-side logic beyond setting the model field to "auto".

Supported endpoints

Auto Routing is supported across the following inference endpoints:

| Endpoint | Streaming supported |
| --- | --- |
| `POST /v1/chat/completions` | Yes |
| `POST /v1/responses` | Yes |
| `POST /v1/embeddings` | No |

Basic request

Just replace your specific model ID with "auto":

curl https://api.meshapi.ai/v1/chat/completions \
  -H "Authorization: Bearer <YOUR_RSK_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Write a Python quicksort implementation"}]
  }'
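
The same request can be sketched in Python using only the standard library, which avoids assuming anything about the official SDK interfaces. The helper names `build_auto_request` and `send` are hypothetical and chosen for illustration; the URL, headers, and payload mirror the curl example above.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.meshapi.ai/v1/chat/completions"


def build_auto_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request that uses Auto Routing."""
    payload = {
        "model": "auto",  # the only change needed to opt in to Auto Routing
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body (hypothetical helper)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_auto_request(api_key, "Write a Python quicksort implementation"))` would perform the request; building and sending are kept separate so the payload can be inspected without network access.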

Response metadata

When a request is automatically routed, Mesh API injects metadata into the response so you know which model was actually used.

Non-streaming requests

The metadata is included directly in the response body.

{
  "id": "chatcmpl-...",
  "model": "openai/gpt-4o",
  "choices": [...],
  "x_auto_routed": true,
  "x_resolved_model_id": "openai/gpt-4o"
}

If the internal classifier failed or timed out and a fallback model was used, additional fields are present:

{
  "x_auto_routed": true,
  "x_resolved_model_id": "openai/gpt-4o-mini",
  "x_auto_routed_fallback": true,
  "x_auto_routed_fallback_reason": "classifier_timeout"
}
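
A client might read these fields from the decoded response body as in the minimal sketch below. The `RoutingInfo` class and `routing_info` function are hypothetical names introduced for illustration; only the `x_*` field names come from the examples above. The sketch assumes that absent fields mean "not auto-routed" or "no fallback", which this page does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RoutingInfo:
    auto_routed: bool
    resolved_model: Optional[str]
    fallback: bool
    fallback_reason: Optional[str]


def routing_info(body: dict) -> RoutingInfo:
    """Extract the Auto Routing fields from a decoded non-streaming response body."""
    return RoutingInfo(
        # Assumption: a missing flag is treated the same as false.
        auto_routed=bool(body.get("x_auto_routed", False)),
        resolved_model=body.get("x_resolved_model_id"),
        fallback=bool(body.get("x_auto_routed_fallback", False)),
        fallback_reason=body.get("x_auto_routed_fallback_reason"),
    )


# Example using the fallback response shown above.
info = routing_info({
    "x_auto_routed": True,
    "x_resolved_model_id": "openai/gpt-4o-mini",
    "x_auto_routed_fallback": True,
    "x_auto_routed_fallback_reason": "classifier_timeout",
})
if info.fallback:
    print(f"fallback to {info.resolved_model}: {info.fallback_reason}")
```

Logging the fallback reason this way makes it easier to notice when requests are regularly being served by the default model instead of a classifier-selected one.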

Streaming requests

For streaming requests (stream: true), the metadata is included as HTTP response headers before the SSE stream begins:

X-Auto-Routed: true
X-Resolved-Model-Id: openai/gpt-4o
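
For streaming responses, the headers can be inspected before the stream is consumed. The sketch below uses a hypothetical helper, `routing_from_headers`, that accepts any mapping of header names to values; the header names are the two shown above, and treating a missing `X-Auto-Routed` header as "not auto-routed" is an assumption.

```python
from typing import Mapping, Optional, Tuple


def routing_from_headers(headers: Mapping[str, str]) -> Tuple[bool, Optional[str]]:
    """Return (auto_routed, resolved_model_id) from streaming response headers."""
    # HTTP header names are case-insensitive, so normalize before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}
    auto_routed = lowered.get("x-auto-routed", "").strip().lower() == "true"
    return auto_routed, lowered.get("x-resolved-model-id")
```

Most Python HTTP clients expose response headers as an object with an `items()` method, so the value can usually be passed in directly, for example `routing_from_headers(response.headers)` with `urllib.request`.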

Fallback behavior

The Auto Router is designed to never block a request due to its own failure. If the internal classification model fails to respond in time or returns an unknown model, the gateway will automatically fall back to a reliable default model (e.g., openai/gpt-4o-mini).

Billing

When using the Auto Router, you are billed for the tokens consumed by the resolved model that actually served the request, as well as the tokens consumed by the internal classifier model. Both will appear in your usage dashboard.
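
As a rough illustration of how the two charges combine, the sketch below adds the resolved model's token cost to the classifier's token cost. The function name, the per-1,000-token pricing unit, and all numbers used with it are hypothetical placeholders, not actual Mesh API prices or a description of how the usage dashboard computes totals.

```python
from decimal import Decimal


def estimate_cost(
    resolved_tokens: int,
    resolved_price_per_1k: Decimal,
    classifier_tokens: int,
    classifier_price_per_1k: Decimal,
) -> Decimal:
    """Sum the resolved model's charge and the classifier's charge."""
    # Prices are assumed to be quoted per 1,000 tokens (illustrative unit only).
    resolved_cost = Decimal(resolved_tokens) / 1000 * resolved_price_per_1k
    classifier_cost = Decimal(classifier_tokens) / 1000 * classifier_price_per_1k
    return resolved_cost + classifier_cost
```

With placeholder values of 2,000 resolved-model tokens at 0.50 per 1,000 and 100 classifier tokens at 0.10 per 1,000, the total is 1.00 + 0.01 = 1.01.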