Audio (TTS & STT)

Audio

Text-to-Speech

client.audio.synthesize sends POST /v1/audio/speech and returns a Uint8Array of raw audio bytes.

import { MeshAPI } from "meshapi-node-sdk";
import { writeFileSync } from "fs";

const client = new MeshAPI({ baseUrl: "https://api.meshapi.ai", token: "rsk_..." });

const audio = await client.audio.synthesize({
  input: "Hello from MeshAPI.",
  model: "sarvam/bulbul:v2",
  voice: "meera",
});

writeFileSync("output.wav", Buffer.from(audio));

SpeechParams fields

| Field | Type | Notes |
| --- | --- | --- |
| input | string | Required. Text to synthesize. |
| model | string | Required. e.g. "sarvam/bulbul:v2" |
| voice | string? | Voice ID or name |
| response_format | string? | Audio format, e.g. "wav", "mp3" |
| speed | number? | Playback speed multiplier |
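
The optional fields can be combined as in the sketch below. The `SpeechParams` interface is re-declared locally from the table above so the snippet stands alone (the SDK presumably exports its own type), and `outputFileName` is a hypothetical helper, not part of the SDK. Falling back to a `.wav` extension when `response_format` is omitted is an assumption drawn from the example above, not a documented default.

```typescript
// Local mirror of the SpeechParams table; the SDK's exported type may differ.
interface SpeechParams {
  input: string;
  model: string;
  voice?: string;
  response_format?: string;
  speed?: number;
}

// Hypothetical helper: pick a file name whose extension matches the
// requested audio format. The "wav" fallback is an assumption.
function outputFileName(baseName: string, params: SpeechParams): string {
  const extension = params.response_format ?? "wav";
  return `${baseName}.${extension}`;
}

const speechParams: SpeechParams = {
  input: "Hello from MeshAPI.",
  model: "sarvam/bulbul:v2",
  voice: "meera",
  response_format: "mp3",
  speed: 1.25, // multiplier; assumed to mean 1.25x normal speed
};

console.log(outputFileName("greeting", speechParams)); // prints "greeting.mp3"

// With a configured client, the call would then look like:
//   const audio = await client.audio.synthesize(speechParams);
//   writeFileSync(outputFileName("greeting", speechParams), Buffer.from(audio));
```

Deriving the extension from `response_format` keeps the file name on disk in step with the bytes that were requested.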

Speech-to-Text (Transcription)

client.audio.transcribe sends POST /v1/audio/transcriptions as a multipart upload and returns a TranscriptionResponse.

import { readFileSync } from "fs";

// Reuses the `client` created in the Text-to-Speech example.
const fileBytes = readFileSync("audio.wav");

const result = await client.audio.transcribe({
  model: "sarvam/saaras:v3",
  file: fileBytes,
  file_name: "audio.wav",
  language: "en",
});

console.log(result.text);

TranscriptionParams key fields

| Field | Type | Notes |
| --- | --- | --- |
| model | string | Required. e.g. "sarvam/saaras:v3" |
| file | Uint8Array \| Buffer | Required. Audio file bytes. |
| file_name | string | Required. Filename with extension. |
| language | string? | Language code, e.g. "en" |
| keyterms | string[]? | Domain-specific terms to boost recognition |
| diarize | boolean? | Enable speaker diarization |
| num_speakers | number? | Expected number of speakers |
| with_timestamps | boolean? | Include word-level timestamps |
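
As a rough illustration of how the optional fields fit together, the sketch below builds parameters for a multi-speaker recording. The interface is a local mirror of the table above, and both `fileNameFromPath` and `meetingTranscriptionParams` are hypothetical helpers, not SDK functions. Pairing `diarize` with `num_speakers` is an assumption about typical usage; the table does not say the two must be set together.

```typescript
// Local mirror of the TranscriptionParams table; the SDK's exported type may differ.
interface TranscriptionParams {
  model: string;
  file: Uint8Array; // a Node.js Buffer also satisfies this, since Buffer extends Uint8Array
  file_name: string;
  language?: string;
  keyterms?: string[];
  diarize?: boolean;
  num_speakers?: number;
  with_timestamps?: boolean;
}

// Hypothetical helper: take the last path segment so file_name keeps its extension.
function fileNameFromPath(path: string): string {
  const segments = path.split(/[\\/]/);
  return segments[segments.length - 1] ?? path;
}

// Hypothetical helper: parameters for a recording with several speakers.
function meetingTranscriptionParams(
  path: string,
  bytes: Uint8Array,
  speakers: number,
): TranscriptionParams {
  return {
    model: "sarvam/saaras:v3",
    file: bytes,
    file_name: fileNameFromPath(path),
    language: "en",
    keyterms: ["MeshAPI"],
    diarize: true,
    num_speakers: speakers,
    with_timestamps: true,
  };
}

// Empty bytes stand in for readFileSync("recordings/standup.wav") here.
const meetingParams = meetingTranscriptionParams(
  "recordings/standup.wav",
  new Uint8Array(0),
  2,
);
console.log(meetingParams.file_name); // prints "standup.wav"

// With a configured client:
//   const result = await client.audio.transcribe(meetingParams);
```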

Translation

client.audio.translate sends POST /v1/audio/transcriptions/translate and returns the audio transcribed and translated to English.

// Reuses `client` and `fileBytes` from the examples above.
const result = await client.audio.translate({
  model: "sarvam/saaras:v3",
  file: fileBytes,
  file_name: "audio.wav",
});

console.log(result.text);
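
Because translation always produces English, one possible pattern is to route English recordings to `transcribe` and everything else to `translate`. The sketch below is only an illustration: `AudioApi` is a minimal structural stand-in for `client.audio` (covering just the two methods and the `text` field used in this page), and `englishText` is a hypothetical helper, not an SDK function.

```typescript
// Minimal structural stand-ins for the SDK types; the real ones may carry more fields.
interface AudioRequest {
  model: string;
  file: Uint8Array;
  file_name: string;
}
interface TextResult {
  text: string;
}
interface AudioApi {
  transcribe(params: AudioRequest & { language?: string }): Promise<TextResult>;
  translate(params: AudioRequest): Promise<TextResult>;
}

// Hypothetical helper: always return English text for a recording.
// English audio is transcribed directly; any other language goes through translate.
async function englishText(
  audio: AudioApi,
  request: AudioRequest,
  spokenLanguage: string,
): Promise<string> {
  const result =
    spokenLanguage === "en"
      ? await audio.transcribe({ ...request, language: "en" })
      : await audio.translate(request);
  return result.text;
}

// With a configured client:
//   const text = await englishText(client.audio, {
//     model: "sarvam/saaras:v3",
//     file: fileBytes,
//     file_name: "audio.wav",
//   }, "hi");
```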

List Voices

client.audio.listVoices sends GET /v1/audio/voices.

const voices = await client.audio.listVoices({ page_size: 10 });
console.log(voices);

ListVoicesParams fields

| Field | Type | Notes |
| --- | --- | --- |
| page_size | number? | Results per page |
| next_page_token | string? | Pagination cursor |
| search | string? | Filter by name |
| voice_type | string? | "standard", "cloned", etc. |
| category | string? | Voice category filter |
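
The `page_size` and `next_page_token` fields suggest cursor-based pagination. The sketch below walks every page under one assumption that this page does not confirm: that each response carries a `voices` array and, while more results remain, a `next_page_token`. `VoicesPage`, `listAllVoices`, and the `maxPages` safety cap are hypothetical names introduced for illustration.

```typescript
// Local mirror of the ListVoicesParams table.
interface ListVoicesParams {
  page_size?: number;
  next_page_token?: string;
  search?: string;
  voice_type?: string;
  category?: string;
}

// ASSUMED response shape; check the actual listVoices response before relying on it.
interface VoicesPage<V> {
  voices: V[];
  next_page_token?: string | null;
}

// Hypothetical helper: follow next_page_token until the server stops returning one.
async function listAllVoices<V>(
  listPage: (params: ListVoicesParams) => Promise<VoicesPage<V>>,
  filters: ListVoicesParams = {},
  maxPages = 50, // guard against a cursor that never terminates
): Promise<V[]> {
  const all: V[] = [];
  let token: string | undefined = filters.next_page_token;
  for (let page = 0; page < maxPages; page++) {
    const params: ListVoicesParams =
      token === undefined ? { ...filters } : { ...filters, next_page_token: token };
    const result = await listPage(params);
    all.push(...result.voices);
    if (!result.next_page_token) break; // no cursor means this was the last page
    token = result.next_page_token;
  }
  return all;
}

// With a configured client (and if the assumed response shape holds):
//   const voices = await listAllVoices(
//     (params) => client.audio.listVoices(params),
//     { page_size: 50, voice_type: "standard" },
//   );
```

Passing the page fetcher in as a function keeps the loop independent of the SDK, so it can be exercised with a stub before being pointed at the real endpoint.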

Get Voice

client.audio.getVoice sends GET /v1/audio/voices/{voice_id}.

const voice = await client.audio.getVoice("voice-id");
console.log(voice);