
Audio

Mesh API handles audio through the chat completions endpoint, POST /v1/chat/completions.

This page covers:

  • audio input, using the input_audio content part
  • audio output, using the modalities and audio request fields

Audio input

Send audio as an input_audio content part inside a chat message. The part carries the base64-encoded clip in data and its file type in format.

curl https://api.meshapi.ai/v1/chat/completions \
  -H "Authorization: Bearer <YOUR_RSK_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-3-flash-preview",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Transcribe this clip." },
          {
            "type": "input_audio",
            "input_audio": {
              "data": "<BASE64_AUDIO>",
              "format": "wav"
            }
          }
        ]
      }
    ]
  }'
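The curl request above can also be assembled programmatically. The following is a minimal sketch that uses only the Python standard library rather than any Mesh SDK. The endpoint, model, and field names come from the curl example; the helper names build_audio_input_body and send_chat_completion are hypothetical and chosen only for illustration.

```python
import base64
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.meshapi.ai/v1/chat/completions"


def build_audio_input_body(audio_bytes: bytes, prompt: str,
                           model: str = "google/gemini-3-flash-preview",
                           audio_format: str = "wav") -> dict:
    """Build a request body with one text part and one input_audio part.

    Hypothetical helper: it mirrors the JSON in the curl example.
    """
    # The API expects the raw audio bytes as a base64 string.
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "input_audio",
                        "input_audio": {"data": encoded, "format": audio_format},
                    },
                ],
            }
        ],
    }


def send_chat_completion(body: dict, api_key: str) -> dict:
    """POST the body and return the parsed JSON response (network call)."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To use it, read a local file in binary mode (for example open("clip.wav", "rb")), pass the bytes and the prompt "Transcribe this clip." to build_audio_input_body, and hand the result to send_chat_completion together with your key.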

Audio output

For models that support audio output, request text and audio together: set modalities to ["text", "audio"] and add an audio object that names the voice and format.

curl https://api.meshapi.ai/v1/chat/completions \
  -H "Authorization: Bearer <YOUR_RSK_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-audio-preview",
    "messages": [
      { "role": "user", "content": "Read this back to me in a calm voice." }
    ],
    "modalities": ["text", "audio"],
    "audio": {
      "voice": "alloy",
      "format": "wav"
    }
  }'

All four SDKs accept the same request shape: set the modalities and audio fields on the chat completions call.
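As a rough illustration of the audio output request, the sketch below builds the same body in plain Python and decodes audio from a response. The request fields come from the curl example. The response shape is an assumption: this page does not document it, so the code assumes an OpenAI-style layout with the base64 audio at choices[0].message.audio.data. Both helper names are hypothetical.

```python
import base64


def build_audio_output_body(prompt: str,
                            model: str = "openai/gpt-4o-audio-preview",
                            voice: str = "alloy",
                            audio_format: str = "wav") -> dict:
    """Build a request body that asks for text and audio together.

    Hypothetical helper: it mirrors the JSON in the curl example.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "modalities": ["text", "audio"],
        "audio": {"voice": voice, "format": audio_format},
    }


def extract_audio_bytes(response: dict) -> bytes:
    """Decode the base64 audio in a parsed response.

    ASSUMPTION: the audio lives at choices[0].message.audio.data, as in
    OpenAI-style responses. Check a real response before relying on this.
    """
    audio = response["choices"][0]["message"]["audio"]
    return base64.b64decode(audio["data"])
```

If the assumed shape holds, the returned bytes can be written to a file opened in binary mode, with an extension that matches the requested format.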

SDK coverage

  • Node: client.chat.completions.create(...)
  • Python: client.chat.completions.create(...)
  • Go: client.Chat.Completions.Create(...)
  • Java: client.chat().completions().create(...)

Notes

  • Audio payloads are base64-encoded in the request body.
  • Call GET /v1/models to find models that accept or produce audio.
  • Base64 encoding makes a payload roughly one third larger than the raw audio, so keep clips short, especially for browser-based clients.