Audio

Text-to-Speech

`client.audio.synthesize` sends `POST /v1/audio/speech` and returns the raw audio bytes.

from meshapi import MeshAPI, SpeechParams

client = MeshAPI(base_url="https://api.meshapi.ai", token="rsk_...")

audio_bytes = client.audio.synthesize(
    SpeechParams(
        input="Hello from MeshAPI.",
        model="sarvam/bulbul:v2",
        voice="meera",
    )
)

with open("output.wav", "wb") as f:
    f.write(audio_bytes)

Async

import asyncio

from meshapi import AsyncMeshAPI, SpeechParams


async def main() -> None:
    # `async with` is only valid inside a coroutine, so wrap the call in main().
    async with AsyncMeshAPI(base_url="https://api.meshapi.ai", token="rsk_...") as client:
        audio_bytes = await client.audio.synthesize(
            SpeechParams(
                input="Hello from MeshAPI.",
                model="sarvam/bulbul:v2",
            )
        )


asyncio.run(main())
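One reason to reach for the async client is to synthesize several clips at once. The following is a minimal sketch of bounded concurrency using only `asyncio`. The names `synthesize_many`, `synthesize_one`, and `fake_synthesize` are hypothetical and not part of the SDK; `fake_synthesize` stands in for a coroutine that would call `client.audio.synthesize`.

```python
import asyncio
from typing import Awaitable, Callable, List


async def synthesize_many(
    synthesize_one: Callable[[str], Awaitable[bytes]],
    texts: List[str],
    max_concurrency: int = 4,
) -> List[bytes]:
    """Run synthesize_one over texts concurrently, preserving input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(text: str) -> bytes:
        # Hold one semaphore slot for the duration of a single request.
        async with semaphore:
            return await synthesize_one(text)

    # gather returns results in the same order as the awaitables passed in.
    return list(await asyncio.gather(*(bounded(t) for t in texts)))


async def fake_synthesize(text: str) -> bytes:
    # Hypothetical stand-in for a real request; it just echoes the text as bytes.
    await asyncio.sleep(0)
    return text.encode("utf-8")


if __name__ == "__main__":
    clips = asyncio.run(synthesize_many(fake_synthesize, ["one", "two", "three"]))
    print([len(clip) for clip in clips])
```

With the real client, `synthesize_one` would be a small coroutine that builds a `SpeechParams` for the given text and awaits `client.audio.synthesize`. The concurrency limit is an arbitrary default, not a documented rate limit.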

`SpeechParams` fields

| Field | Type | Notes |
| --- | --- | --- |
| `input` | `str` | Required. Text to synthesize. |
| `model` | `str` | Required. e.g. `"sarvam/bulbul:v2"` |
| `voice` | `str \| None` | Voice ID or name |
| `response_format` | `str \| None` | Audio format, e.g. `"wav"`, `"mp3"` |
| `speed` | `float \| None` | Playback speed multiplier |
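Because `synthesize` returns raw bytes, the file extension you choose on disk should match the format actually returned. As a rough sketch, the helper below guesses the container from the leading magic bytes. `sniff_audio_format` is a hypothetical name, not an SDK function, and the check is a heuristic covering only WAV and MP3.

```python
from typing import Optional


def sniff_audio_format(data: bytes) -> Optional[str]:
    """Guess the audio container from leading magic bytes; None if unrecognized."""
    # WAV: "RIFF" chunk ID, a 4-byte size, then the "WAVE" form type.
    if len(data) >= 12 and data[0:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # MP3 with an ID3v2 tag starts with the ASCII bytes "ID3".
    if data[0:3] == b"ID3":
        return "mp3"
    # Untagged MP3 starts at a frame header whose first 11 bits are all set.
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3"
    return None
```

A caller could use the result to pick the output filename, falling back to the requested `response_format` when the sniff returns `None`.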

Speech-to-Text (Transcription)

`client.audio.transcribe` sends `POST /v1/audio/transcriptions` as a multipart upload and returns a `TranscriptionResponse`.

from meshapi import TranscriptionParams

# `client` is a MeshAPI instance, created as in the Text-to-Speech example.
with open("audio.wav", "rb") as f:
    file_bytes = f.read()

result = client.audio.transcribe(
    file_bytes,
    TranscriptionParams(
        model="sarvam/saaras:v3",
        # Optional: language_code is model-specific (e.g. Sarvam expects "en-IN", not "en").
    ),
    filename="audio.wav",
)

print(result.text)
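The upload calls take the file contents and a filename as separate arguments. A small helper can derive both from a single path so the two never drift apart. This is an illustrative sketch; `load_audio` is a hypothetical name and not part of the SDK.

```python
from pathlib import Path
from typing import Tuple


def load_audio(path: str) -> Tuple[bytes, str]:
    """Return (file_bytes, filename) for an audio file on disk."""
    p = Path(path)
    # read_bytes opens the file in binary mode and closes it afterwards.
    return p.read_bytes(), p.name
```

The returned pair can then be passed as the first positional argument and the `filename` keyword of `client.audio.transcribe`.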

`TranscriptionParams` key fields

| Field | Type | Notes |
| --- | --- | --- |
| `model` | `str` | Required. e.g. `"sarvam/saaras:v3"` |
| `language_code` | `str \| None` | Optional. Model-specific language code (e.g. Sarvam expects `"en-IN"`) |
| `diarize` | `bool \| None` | Enable speaker diarization |
| `num_speakers` | `int \| None` | Expected number of speakers |
| `timestamps_granularity` | `str \| None` | e.g. `"word"` for word-level timestamps |
| `tag_audio_events` | `bool \| None` | Tag non-speech audio events |
| `additional_formats` | `str \| None` | Request extra output formats |

Translation

`client.audio.translate` sends `POST /v1/audio/transcriptions/translate`. It transcribes the audio and returns the transcript translated to English.

from meshapi import TranscriptionTranslateParams

# `client` is a MeshAPI instance, created as in the Text-to-Speech example.
with open("audio.wav", "rb") as f:
    file_bytes = f.read()

result = client.audio.translate(
    file_bytes,
    TranscriptionTranslateParams(
        model="sarvam/saaras:v3",
    ),
    filename="audio.wav",
)

print(result.text)

Translation (to English)

`client.audio.audio_translate` sends `POST /v1/audio/translations` and returns the audio translated directly to English. This is a distinct endpoint from the transcribe-and-translate helper above.

from meshapi import MeshAPI, AudioTranslationsParams

client = MeshAPI(base_url="https://api.meshapi.ai", token="rsk_...")

with open("audio.mp3", "rb") as f:
    file_bytes = f.read()

result = client.audio.audio_translate(
    file_bytes,
    AudioTranslationsParams(
        model="openai/whisper-large-v3",
    ),
    filename="audio.mp3",
)

print(result.text)  # English translation

Async

import asyncio

from meshapi import AsyncMeshAPI, AudioTranslationsParams


async def main() -> None:
    # `async with` is only valid inside a coroutine, so wrap the call in main().
    async with AsyncMeshAPI(base_url="https://api.meshapi.ai", token="rsk_...") as client:
        with open("audio.mp3", "rb") as f:
            file_bytes = f.read()

        result = await client.audio.audio_translate(
            file_bytes,
            AudioTranslationsParams(model="openai/whisper-large-v3"),
            filename="audio.mp3",
        )
        print(result.text)


asyncio.run(main())

`AudioTranslationsParams` fields

| Field | Type | Notes |
| --- | --- | --- |
| `model` | `str` | Required. A translation-capable model (see the Models list). |
| `prompt` | `str \| None` | Optional context hint to guide the translation. |
| `response_format` | `str \| None` | `"json"`, `"text"`, or `"verbose_json"`. |
| `temperature` | `float \| None` | Sampling temperature (0–2). |

The response's `.text` field contains the English translation.
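The documented ranges for `response_format` and `temperature` can be checked locally before a request is sent. The sketch below is one way to do that. `check_translation_options` is a hypothetical helper, and whether the SDK or the server performs the same validation is not stated here.

```python
from typing import Optional

# Values listed in the AudioTranslationsParams table.
ALLOWED_RESPONSE_FORMATS = {"json", "text", "verbose_json"}


def check_translation_options(
    response_format: Optional[str] = None,
    temperature: Optional[float] = None,
) -> None:
    """Raise ValueError if an option falls outside the documented values."""
    if response_format is not None and response_format not in ALLOWED_RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    # The documented sampling temperature range is 0 to 2 inclusive.
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError(f"temperature must be between 0 and 2, got {temperature}")
```

Calling it with no arguments is a no-op, since both fields are optional.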

List Voices

`client.audio.list_voices` sends `GET /v1/audio/voices`.

from meshapi import ListVoicesParams

# `client` is a MeshAPI instance, created as in the Text-to-Speech example.
voices = client.audio.list_voices(ListVoicesParams(page_size=10))
print(voices)

`ListVoicesParams` fields

| Field | Type | Notes |
| --- | --- | --- |
| `page_size` | `int \| None` | Results per page |
| `next_page_token` | `str \| None` | Pagination cursor |
| `search` | `str \| None` | Filter by name |
| `voice_type` | `str \| None` | `"standard"`, `"cloned"`, etc. |
| `category` | `str \| None` | Voice category filter |
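The `next_page_token` field implies cursor-style pagination: pass the token from one response into the next request until none is returned. The shape of the `list_voices` response is not described in this section, so the sketch below keeps it abstract. `iter_pages` and `fetch_page` are hypothetical names, and `fetch_page` is an adapter you would write to return `(items, next_token)` from a real response.

```python
from typing import Callable, Iterator, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def iter_pages(
    fetch_page: Callable[[Optional[str]], Tuple[List[T], Optional[str]]],
) -> Iterator[T]:
    """Yield items from every page, following the cursor until it runs out."""
    token: Optional[str] = None
    while True:
        # The first call passes None, meaning "start from the beginning".
        items, token = fetch_page(token)
        yield from items
        # A missing or empty token signals the last page.
        if not token:
            break


if __name__ == "__main__":
    # Fake two-page data set standing in for real API responses.
    fake_pages = {None: (["voice-a", "voice-b"], "cursor-1"), "cursor-1": (["voice-c"], None)}
    for voice in iter_pages(lambda tok: fake_pages[tok]):
        print(voice)
```

In real use, `fetch_page` would call `client.audio.list_voices(ListVoicesParams(page_size=..., next_page_token=token))` and pull the voice list and next token out of the response, whatever those attributes are actually called.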

Get Voice

`client.audio.get_voice` sends `GET /v1/audio/voices/{voice_id}`.

voice = client.audio.get_voice("voice-id")
print(voice)
