Generates a model response from a list of chat messages.
This endpoint is intentionally shaped like OpenAI's
`POST /v1/chat/completions` so existing SDK integrations can switch
base URLs with minimal changes.
Mesh-specific request extensions:
- `template`: resolve a stored prompt template by name or UUID
- `variables`: values used to render `{{slot}}` placeholders
- `session_id`: caller-defined grouping key for usage reporting
Streaming responses are returned as Server-Sent Events when
`stream=true`.
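When `stream=true`, each event arrives on a `data:` line of the SSE stream. The sketch below shows one way a client might consume those lines in Python, assuming the chunk payloads mirror OpenAI's ChatCompletionChunk shape (the `choices`/`delta` field names are assumptions based on that shape, not confirmed by this page):

```python
import json

def iter_sse_chunks(lines):
    """Yield parsed chunk payloads from raw SSE lines until the
    terminal `data: [DONE]` sentinel is seen."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)

# Example: stitch the streamed text deltas back into one string.
raw = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    '',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    'data: [DONE]',
]
text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in iter_sse_chunks(raw)
)
# text == "Hello"
```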
Request
This endpoint expects an object.
messages (list of objects, required)
Conversation history in OpenAI chat format.
model (string or null, optional; defaults to openai/gpt-4o)
Model identifier. If omitted, the backend resolves it from the API
key's default model or the selected template.
template (string or null, optional)
Template name or UUID to expand before inference.
variables (map from strings to strings, or null; optional)
Values used when rendering `{{slot}}` placeholders.
session_id (string or null, optional)
Caller-defined grouping key for usage reporting.
stream (boolean, optional; defaults to false)
When true, returns SSE chunks instead of a JSON object.
temperature (double or null, optional)
Sampling temperature used for token selection. Lower values make
output more deterministic; higher values increase randomness and
variation.
max_tokens (integer or null, optional; >= 1)
Maximum number of completion tokens to generate.
top_p (double or null, optional)
Nucleus sampling threshold. The model samples from the smallest set
of tokens whose cumulative probability reaches top_p.
frequency_penalty (double or null, optional)
Penalizes tokens that have already appeared in the generated output,
reducing repeated phrasing.
presence_penalty (double or null, optional)
Penalizes tokens that have already appeared at least once, nudging
the model toward introducing new topics.
stop (string, list of strings, or null; optional)
One or more stop sequences that end generation when encountered.
seed (integer or null, optional)
Seed for best-effort deterministic sampling across repeated requests
with the same parameters.
tools (list of objects or null, optional)
Tool definitions the model may call during the completion.
tool_choice (enum, object, or null; optional)
Controls whether the model may call tools automatically, must avoid
them, must call one, or must call a specific tool.
transforms (list of strings or null, optional)
Ordered list of OpenRouter transforms applied before inference.
models (list of strings or null, optional)
Ordered list of OpenRouter fallback models.
user (string or null, optional; <= 256 characters)
End-user identifier forwarded for abuse monitoring.
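Putting the fields above together, a request body might look like the following sketch. Only `messages` is required; the `template`, `variables`, and `session_id` keys are the mesh-specific extensions, and the model, template name, and variable values shown here are illustrative placeholders, not values defined by this page:

```python
import json

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "user", "content": "Summarize the report in one line."}
    ],
    # Mesh-specific extensions (names from this page; values hypothetical):
    "template": "daily-summary",           # stored template name or UUID
    "variables": {"topic": "quarterly revenue"},
    "session_id": "team-alpha",            # grouping key for usage reporting
    # Standard OpenAI-style knobs:
    "stream": False,
    "temperature": 0.2,
    "max_tokens": 256,
}
body = json.dumps(payload)  # JSON string to send as the request body
```

Fields left out of the payload fall back to their defaults, so a minimal request can carry `messages` alone.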
Response
Successful response.
Non-streaming requests return a ChatCompletionResponse object.
Streaming requests return `text/event-stream` chunks matching
ChatCompletionChunk, followed by a terminal `data: [DONE]` event.
system_fingerprint (string or null)
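Assuming ChatCompletionResponse mirrors OpenAI's response shape (this page names the object but not its nested fields, so the structure below is an assumption), extracting the assistant's text from a non-streaming response could look like:

```python
# Hypothetical non-streaming response body, shaped like OpenAI's
# chat completion object; field names are assumptions.
response = {
    "id": "chatcmpl-123",
    "model": "openai/gpt-4o",
    "system_fingerprint": None,
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello!"},
            "finish_reason": "stop",
        }
    ],
}

def first_message(resp):
    """Return the assistant text of the first choice."""
    return resp["choices"][0]["message"]["content"]
```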