Create Speech


Convert text to speech.

The provider is resolved automatically from the model name via the model_pricing table, with no hardcoded provider branching. The response is streamed when stream=true (the default) and the provider adapter supports streaming.

The voice field is required for ElevenLabs models. Sarvam models use speaker instead.
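As a rough client-side illustration of the voice/speaker rule above, the sketch below builds a request body and fails early when voice is missing for an ElevenLabs model. The helper name build_speech_payload is hypothetical, and the "sarvam/" prefix and the model name "sarvam/example-model" are assumptions made by analogy with the documented default elevenlabs/eleven_turbo_v2_5. The server resolves the provider itself, so this check is only a convenience.

```python
from typing import Any, Optional


def build_speech_payload(
    text: str,
    model: str = "elevenlabs/eleven_turbo_v2_5",  # documented default model
    voice: Optional[str] = None,
    speaker: Optional[str] = None,
) -> dict[str, Any]:
    """Build a request body, choosing voice or speaker by model prefix."""
    payload: dict[str, Any] = {"input": text, "model": model}
    if model.startswith("elevenlabs/"):
        # The docs state that voice is required for ElevenLabs models.
        if not voice:
            raise ValueError("voice is required for ElevenLabs models")
        payload["voice"] = voice
    elif model.startswith("sarvam/"):  # assumed prefix, not confirmed by the docs
        # Sarvam models take speaker; the server defaults it to anushka if omitted.
        if speaker is not None:
            payload["speaker"] = speaker
    else:
        # Unknown provider: pass through whichever field the caller supplied.
        if voice is not None:
            payload["voice"] = voice
        if speaker is not None:
            payload["speaker"] = speaker
    return payload
```

For example, build_speech_payload("Namaste", model="sarvam/example-model", speaker="anushka") yields a body with speaker set and no voice key.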

Authentication

Authorization: Bearer

Bearer authentication of the form Bearer <token>, where token is your auth token.
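A minimal sketch of building that header on the client side. The helper name auth_headers is hypothetical; only the "Bearer <token>" form comes from this page.

```python
def auth_headers(token: str) -> dict[str, str]:
    """Return the Authorization header in the documented Bearer <token> form."""
    token = token.strip()
    if not token:
        raise ValueError("an auth token is required")
    return {"Authorization": f"Bearer {token}"}
```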

Request

This endpoint expects an object.
input (string, required)
model (string, optional, defaults to elevenlabs/eleven_turbo_v2_5)
voice (string or null, optional)
stream (boolean, optional, defaults to true)
response_format (string or null, optional)
language_code (string or null, optional)
voice_settings (object or null, optional)
pronunciation_dictionary_locators (list of objects or null, optional)
seed (integer or null, optional)
previous_text (string or null, optional)
next_text (string or null, optional)
previous_request_ids (list of strings or null, optional)
next_request_ids (list of strings or null, optional)
apply_text_normalization (string or null, optional)
apply_language_text_normalization (boolean or null, optional)
use_pvc_as_ivc (boolean or null, optional)
enable_logging (boolean or null, optional)
optimize_streaming_latency (integer or null, optional)
speaker (string, optional, defaults to anushka)
target_language_code (string, optional, defaults to hi-IN)
pitch (double or null, optional)
pace (double or null, optional)
loudness (double or null, optional)
speech_sample_rate (integer or null, optional)
enable_preprocessing (boolean or null, optional)
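The sketch below shows one way to turn these fields into a POST request using only the Python standard library. Several things here are assumptions rather than documented behavior: the endpoint URL is not given on this page, so it is passed in by the caller; the body is assumed to be JSON; omitting a nullable field is assumed to be equivalent to sending null; and the successful response is assumed to be raw audio bytes. The helper names build_request and save_audio are hypothetical.

```python
import json
import urllib.request
from typing import Any


def build_request(url: str, token: str, **fields: Any) -> urllib.request.Request:
    """Build (but do not send) a POST request, dropping fields set to None."""
    body = {name: value for name, value in fields.items() if value is not None}
    if "input" not in body:
        raise ValueError("input is required")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed: JSON request body
        },
        method="POST",
    )


def save_audio(req: urllib.request.Request, out_path: str, chunk_size: int = 8192) -> int:
    """Send the request and write the response to disk chunk by chunk.

    Assumes the response body is audio data; returns the number of bytes written.
    """
    written = 0
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as out:
        while chunk := resp.read(chunk_size):
            out.write(chunk)
            written += len(chunk)
    return written
```

A call might look like save_audio(build_request("https://api.example.com/speech", token, input="Hello", voice="my-voice-id"), "hello.mp3"), where the URL, the voice ID, and the output extension are placeholders. Reading in chunks lets a streamed response be written as it arrives instead of being buffered in memory.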

Response

Successful Response

Errors

422: Unprocessable Entity Error
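The shape of the 422 error body is not documented here, so the sketch below stays defensive: it tries to parse the body as JSON, looks for a "detail" key (an assumption, not something this page confirms), and falls back to the raw text. The helper name describe_error is hypothetical.

```python
import json


def describe_error(status: int, body: bytes) -> str:
    """Summarize an error response without assuming a fixed body schema."""
    text = body.decode("utf-8", errors="replace")
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON: report the raw text as-is.
        return f"HTTP {status}: {text}"
    # "detail" is an assumed key; fall back to the whole parsed body.
    detail = parsed.get("detail", parsed) if isinstance(parsed, dict) else parsed
    return f"HTTP {status}: {json.dumps(detail)}"
```

With urllib, this could be used as: catch urllib.error.HTTPError as err, then log describe_error(err.code, err.read()).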