Compare | Mesh API Docs

Use POST /v1/chat/compare when you want to run the same conversation across multiple models and inspect the results side by side.

How it works

Fan-out: All requested models are called concurrently. The total wall-clock time is roughly that of the slowest model, not the sum of all models.
Error isolation: If a single model fails or times out (hard timeout of 120s), the others continue unaffected. Partial results are returned with a partial: true flag.
Synthesis (default): After all models respond, a separate comparison LLM analyzes the responses and produces a structured evaluation covering accuracy, completeness, clarity, and a recommendation.
Skip synthesis (optional): By setting skip_comparison: true, you can skip the synthesis step and receive only the raw model outputs. This is useful for parallel streaming UIs that perform their own comparison.
Rate limiting and Billing: The entire comparison counts as a single request against your rate limits (RPM/RPD). However, billing tracks each model call plus the comparison call as separate usage events (N+1 events).
Streaming: Two streaming modes are available by setting stream: true. With synthesis enabled, fan-out is non-streaming, but the final comparison text is streamed token-by-token. If skip_comparison: true is set, each fan-out model streams its tokens in real-time concurrently, tagged by model name.

Basic request

curl

Node.js SDK

Python SDK

Go SDK

Java SDK

$ curl https://api.meshapi.ai/v1/chat/compare \
>   -H "Authorization: Bearer <YOUR_RSK_KEY>" \
>   -H "Content-Type: application/json" \
>   -d '{
>     "models": [
>       "openai/gpt-4o-mini",
>       "anthropic/claude-haiku-4.5"
>     ],
>     "messages": [
>       { "role": "user", "content": "Explain vector search in two sentences." }
>     ]
>   }'

Request fields

Field	Type	Notes
`models`	string[]	Models to compare. Min 1, max 10. Duplicates are removed.
`messages`	chat message[]	Conversation sent to each model.
`temperature`	number	Optional sampling temperature applied to every model.
`max_tokens`	integer	Optional max completion tokens applied to every model.
`model_overrides`	object	Optional per-model parameter overrides, keyed by model ID.
`template`	string	Optional prompt template ID to render for each model.
`variables`	object	Optional variables to fill the `template`.
`cache`	boolean	Optional prompt-caching toggle.
`comparison_model`	string	Optional model used for synthesized comparison output.
`comparison_instructions`	string	Optional comparison rubric or guidance.
`skip_comparison`	boolean	Return per-model outputs without synthesized comparison text.
`stream`	boolean	Optional streaming mode.

Response shape

The response includes:

the compared model list
one result per model
optional synthesized comparison text
latency and request metadata

Streaming (SSE)

Set "stream": true to receive a text/event-stream with typed events. There are two streaming modes:

Mode 1: With comparison (`skip_comparison: false`, default)

Fan-out models are non-streaming (full response collected per model), then the comparison LLM streams token-by-token.

Event	When	Payload
`meta`	Immediately after auth	`{"comparison_id", "models", "comparison_model", "skip_comparison": false}`
`model_chunk`	As each fan-out model finishes	`{"model", "delta", "latency_ms", "error", "error_code", "usage"}`
`model_done`	All fan-out results collected	`{"results": [...]}`
`comparison_chunk`	During comparison LLM streaming	`{"delta": "<token>", "finish_reason": null \| "stop"}`
`done`	All complete	`{"comparison_id", "total_latency_ms", "partial", "comparison_model", "comparison_fallback_used"}`

Mode 2: Skip comparison (`skip_comparison: true`)

Each fan-out model streams tokens in real time concurrently, tagged by model name. No comparison LLM is called.

Event	When	Payload
`meta`	Immediately after auth	`{"comparison_id", "models", "comparison_model": null, "skip_comparison": true}`
`model_chunk`	Each token from any model	`{"model": "...", "delta": "Hello", "finish_reason": null}`
`model_stream_done`	One model’s stream ends	`{"model": "...", "finish_reason": "stop", "usage": {...}, "error": null \| "..."}`
`done`	All models finished	`{"comparison_id", "total_latency_ms", "partial", "skip_comparison": true}`

SDK coverage

Node: client.compare.create(...)
Python: client.compare.create(...)
Go: client.Compare.Create(...)
Java: client.compare().create(...)

Streaming compare is also available in the SDKs through their compare stream methods when you want incremental events instead of a single final JSON response.

When to use compare

Evaluate multiple candidate models for a task
Compare cost/quality trade-offs before choosing a default
Build internal prompt evaluations with a stable request shape