Audio | Mesh API Docs

Mesh API supports audio through POST /v1/chat/completions.

Use this page for:

audio input with input_audio
audio output with modalities and audio
audio translation to English with POST /v1/audio/translations

Audio input

Send audio as a content part inside a chat message.

curl

curl https://api.meshapi.ai/v1/chat/completions \
  -H "Authorization: Bearer <YOUR_RSK_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-3-flash-preview",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Transcribe this clip." },
          {
            "type": "input_audio",
            "input_audio": {
              "data": "<BASE64_AUDIO>",
              "format": "wav"
            }
          }
        ]
      }
    ]
  }'
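
The same request body can be assembled programmatically. The following is a minimal sketch using only the Python standard library. The helper name build_audio_input_payload is hypothetical and not part of any Mesh API SDK; the model name and field layout are copied from the curl example.

```python
import base64
import json


def build_audio_input_payload(audio_bytes, audio_format, prompt,
                              model="google/gemini-3-flash-preview"):
    """Build a chat-completions body with one text part and one input_audio part.

    Hypothetical helper for illustration; not part of a Mesh API SDK.
    """
    # The API expects the audio as a base64 string, not raw bytes.
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "input_audio",
                        "input_audio": {"data": encoded, "format": audio_format},
                    },
                ],
            }
        ],
    }


# Placeholder bytes stand in for the contents of a real WAV file.
payload = build_audio_input_payload(b"RIFF....WAVE", "wav", "Transcribe this clip.")
body = json.dumps(payload)  # JSON string to send as the POST body
```

In practice the bytes would come from reading a file in binary mode, and body would be sent to POST /v1/chat/completions with the Authorization and Content-Type headers shown in the curl example.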

Audio output

Request text and audio together when the model supports audio output.

curl https://api.meshapi.ai/v1/chat/completions \
  -H "Authorization: Bearer <YOUR_RSK_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-audio-preview",
    "messages": [
      { "role": "user", "content": "Read this back to me in a calm voice." }
    ],
    "modalities": ["text", "audio"],
    "audio": {
      "voice": "alloy",
      "format": "wav"
    }
  }'

The same request shape is available through all four SDKs by setting chat-completions fields for modalities and audio.
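
As a rough illustration of working with an audio-output request, the sketch below builds the body and decodes returned audio. Both helper names are hypothetical. The response location used here (choices[0].message.audio.data, base64 encoded) is an assumption modeled on OpenAI-style chat completions; this page does not document the response shape, so verify it against a real response before relying on it.

```python
import base64


def build_audio_output_payload(text, voice="alloy", audio_format="wav",
                               model="openai/gpt-4o-audio-preview"):
    """Build a chat-completions body that asks for text and audio together."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": text}],
        "modalities": ["text", "audio"],
        "audio": {"voice": voice, "format": audio_format},
    }


def extract_audio_bytes(response):
    """Decode base64 audio from a parsed response dict, or return None.

    ASSUMPTION: the audio sits at choices[0].message.audio.data, as in
    OpenAI-style chat completions. Not confirmed by this page.
    """
    audio = response["choices"][0]["message"].get("audio") or {}
    data = audio.get("data")
    return base64.b64decode(data) if data else None


payload = build_audio_output_payload("Read this back to me in a calm voice.")

# Hand-made response used only to exercise the helper; not real API output.
fake_response = {
    "choices": [
        {"message": {"audio": {"data": base64.b64encode(b"fake-wav").decode("ascii")}}}
    ]
}
audio_bytes = extract_audio_bytes(fake_response)
```

Returning None when no audio is present lets callers fall back to the text content for models that do not produce audio.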

Translate audio to English

POST /v1/audio/translations accepts audio in any language and returns the speech translated to English. It returns the same TranscriptionResponse (with a .text field) as transcription.

This is a distinct endpoint from the transcribe-and-translate helper at POST /v1/audio/transcriptions/translate. The model parameter is required; check GET /v1/models for models that support translation.

Python SDK

from meshapi import MeshAPI, AudioTranslationsParams

client = MeshAPI(base_url="https://api.meshapi.ai", token="rsk_...")

with open("french_audio.mp3", "rb") as f:
    audio_bytes = f.read()

result = client.audio.audio_translate(
    audio_bytes,
    AudioTranslationsParams(model="openai/whisper-large-v3"),
    filename="french_audio.mp3",
)

print(result.text)  # English translation

Optional parameters: prompt (context hint for the model), response_format (json, text, or verbose_json), and temperature (0–2).
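
The constraints on these optional parameters can be checked client-side before a request is sent. The sketch below is one way to do that; build_translation_options is a hypothetical helper, not an SDK function, and how the resulting values are passed to the SDK is not shown on this page.

```python
ALLOWED_RESPONSE_FORMATS = {"json", "text", "verbose_json"}


def build_translation_options(prompt=None, response_format=None, temperature=None):
    """Return a dict of the optional fields that were set, after basic checks.

    Hypothetical helper for illustration; not part of a Mesh API SDK.
    """
    options = {}
    if prompt is not None:
        options["prompt"] = prompt
    if response_format is not None:
        if response_format not in ALLOWED_RESPONSE_FORMATS:
            raise ValueError(f"unsupported response_format: {response_format!r}")
        options["response_format"] = response_format
    if temperature is not None:
        # The documented range is 0 to 2 inclusive.
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        options["temperature"] = temperature
    return options


options = build_translation_options(
    prompt="Interview about renewable energy.",
    response_format="verbose_json",
    temperature=0.2,
)
```

Unset parameters are omitted from the result, so server-side defaults still apply to anything the caller does not specify.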

SDK coverage

Node: client.chat.completions.create(...)
Python: client.chat.completions.create(...)
Go: client.Chat.Completions.Create(...)
Java: client.chat().completions().create(...)

Notes

Audio payloads are base64-encoded in the request body.
Check GET /v1/models to find models that accept or produce audio.
Keep payload sizes reasonable, especially for browser-based clients.
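
Base64 encoding grows a payload by roughly a third (every 3 raw bytes become 4 characters), which is worth accounting for when sizing requests. The sketch below estimates the encoded size before encoding. The helper names and the idea of a size budget are illustrative assumptions; this page does not state a specific request size limit.

```python
import base64


def base64_encoded_size(raw_size):
    """Length in characters of standard padded base64 for raw_size bytes."""
    # Every group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((raw_size + 2) // 3)


def fits_budget(raw_size, max_encoded_bytes):
    """True if the encoded audio stays within a caller-chosen size budget."""
    return base64_encoded_size(raw_size) <= max_encoded_bytes


raw = b"\x00" * 1000
encoded_len = base64_encoded_size(len(raw))

# The estimate matches what the standard library actually produces.
assert encoded_len == len(base64.b64encode(raw))
```

The estimate covers only the audio data field; the surrounding JSON adds a small amount on top.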