# Audio (TTS & STT)

> Text-to-speech, speech-to-text, and voice management with the Python SDK.

## Text-to-Speech

`client.audio.synthesize` sends `POST /v1/audio/speech` and returns raw audio bytes.

```python
from meshapi import MeshAPI, SpeechParams

client = MeshAPI(base_url="https://api.meshapi.ai", token="rsk_...")

audio_bytes = client.audio.synthesize(
    SpeechParams(
        input="Hello from MeshAPI.",
        model="sarvam/bulbul:v2",
        voice="meera",
    )
)

with open("output.wav", "wb") as f:
    f.write(audio_bytes)
```

### Async

```python
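import asyncio

from meshapi import AsyncMeshAPI, SpeechParams


async def main() -> bytes:
    # `async with` is only valid inside a coroutine, so the call lives in main().
    async with AsyncMeshAPI(base_url="https://api.meshapi.ai", token="rsk_...") as client:
        return await client.audio.synthesize(
            SpeechParams(
                input="Hello from MeshAPI.",
                model="sarvam/bulbul:v2",
            )
        )


audio_bytes = asyncio.run(main())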
```
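When several texts need synthesizing, the async client lends itself to bounded concurrency. The sketch below is an illustration under stated assumptions, not part of the SDK: `synthesize_many` is a hypothetical helper that accepts any coroutine function mapping text to audio bytes (for example, a thin wrapper around `client.audio.synthesize`) and caps the number of in-flight calls with a semaphore. `fake_synthesize` is a stand-in so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable


async def synthesize_many(
    synthesize: Callable[[str], Awaitable[bytes]],
    texts: list[str],
    max_concurrency: int = 4,
) -> list[bytes]:
    """Run `synthesize` over `texts` with at most `max_concurrency` calls in flight.

    Hypothetical helper, not part of the MeshAPI SDK. Results keep input order.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(text: str) -> bytes:
        async with semaphore:
            return await synthesize(text)

    # gather() returns results in the order the awaitables were passed in.
    return await asyncio.gather(*(worker(t) for t in texts))


async def fake_synthesize(text: str) -> bytes:
    """Stand-in for a real call such as client.audio.synthesize(SpeechParams(...))."""
    await asyncio.sleep(0)
    return text.encode("utf-8")


clips = asyncio.run(
    synthesize_many(fake_synthesize, ["one", "two", "three"], max_concurrency=2)
)
```

A suitable `max_concurrency` depends on the rate limits that apply to your account, which this page does not specify.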

### `SpeechParams` fields

| Field             | Type            | Notes                               |
| ----------------- | --------------- | ----------------------------------- |
| `input`           | `str`           | Required. Text to synthesize.       |
| `model`           | `str`           | Required. e.g. `"sarvam/bulbul:v2"` |
| `voice`           | `str \| None`   | Voice ID or name                    |
| `response_format` | `str \| None`   | Audio format, e.g. `"wav"`, `"mp3"` |
| `speed`           | `float \| None` | Playback speed multiplier           |

***

## Speech-to-Text (Transcription)

`client.audio.transcribe` sends `POST /v1/audio/transcriptions` as a multipart upload and returns a `TranscriptionResponse`.

```python
from meshapi import TranscriptionParams

# Reuses the `client` created in the Text-to-Speech example above.

with open("audio.wav", "rb") as f:
    file_bytes = f.read()

result = client.audio.transcribe(
    TranscriptionParams(
        model="sarvam/saaras:v3",
        file=file_bytes,
        file_name="audio.wav",
        language="en",
    )
)

print(result.text)
```

### `TranscriptionParams` key fields

| Field             | Type                | Notes                                      |
| ----------------- | ------------------- | ------------------------------------------ |
| `model`           | `str`               | Required. e.g. `"sarvam/saaras:v3"`        |
| `file`            | `bytes`             | Required. Audio file bytes.                |
| `file_name`       | `str`               | Required. Filename with extension.         |
| `language`        | `str \| None`       | Language code, e.g. `"en"`                 |
| `keyterms`        | `list[str] \| None` | Domain-specific terms to boost recognition |
| `diarize`         | `bool \| None`      | Enable speaker diarization                 |
| `num_speakers`    | `int \| None`       | Expected number of speakers                |
| `with_timestamps` | `bool \| None`      | Include word-level timestamps              |

***

## Translation

`client.audio.translate` sends `POST /v1/audio/transcriptions/translate`. It transcribes the audio and returns the text translated into English.

```python
from meshapi import TranscriptionTranslateParams

with open("audio.wav", "rb") as f:
    file_bytes = f.read()

result = client.audio.translate(
    TranscriptionTranslateParams(
        model="sarvam/saaras:v3",
        file=file_bytes,
        file_name="audio.wav",
    )
)

print(result.text)
```

***

## List Voices

`client.audio.list_voices` sends `GET /v1/audio/voices`.

```python
from meshapi import ListVoicesParams

voices = client.audio.list_voices(ListVoicesParams(page_size=10))
print(voices)
```

### `ListVoicesParams` fields

| Field             | Type          | Notes                          |
| ----------------- | ------------- | ------------------------------ |
| `page_size`       | `int \| None` | Results per page               |
| `next_page_token` | `str \| None` | Pagination cursor              |
| `search`          | `str \| None` | Filter by name                 |
| `voice_type`      | `str \| None` | `"standard"`, `"cloned"`, etc. |
| `category`        | `str \| None` | Voice category filter          |

***

## Get Voice

`client.audio.get_voice` sends `GET /v1/audio/voices/{voice_id}`.

```python
voice = client.audio.get_voice("voice-id")
print(voice)
```