Synthesize text into audio. The router picks the best TTS provider for your (language, vertical, optimize_for) and fails over automatically.
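from pathlib import Path

# `speko` is an already-constructed client instance.
speech = speko.synthesize(
    "Hello world",
    language="en",
    vertical="general",
)
# The router chose the provider, so check content_type before assuming MP3.
ext = "mp3" if "mpeg" in speech.content_type else "pcm"
Path(f"out.{ext}").write_bytes(speech.audio)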

Signature

Speko.synthesize(
    text: str,
    *,
    language: str,
    vertical: Vertical,
    optimize_for: OptimizeFor | None = None,
    voice: str | None = None,
    speed: float | None = None,
    constraints: PipelineConstraints | dict | None = None,
) -> SynthesizeResult

Parameters

text
string
required
The text to synthesize. No client-side length limit — the upstream provider applies its own.
language
string
required
BCP-47 language tag, for example en.
vertical
'general' | 'healthcare' | 'finance' | 'legal'
required
optimize_for
'balanced' | 'accuracy' | 'latency' | 'cost'
voice
string
Voice id override. Interpreted per provider (e.g. a Cartesia voice UUID, an ElevenLabs voice id). Omit to use each provider’s default.
speed
float
Speech speed multiplier. Providers vary in accepted ranges — 1.0 is always neutral.
constraints
PipelineConstraints | dict
Allow-list constraints that restrict which providers the router may choose, for example allowed_providers with a tts list.

Returns — SynthesizeResult

audio
bytes
Raw audio bytes. Format depends on the chosen provider — always branch on content_type.
content_type
string
MIME type. ElevenLabs returns audio/mpeg. Cartesia returns audio/pcm;rate=24000.
provider
string
model
string
failover_count
int
Number of providers that were tried and failed before this one succeeded. 0 means the first choice worked.
scores_run_id
string | None
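
Because the audio format varies by provider, it can help to parse content_type once and reuse the result. The following is a minimal sketch, not part of the SDK: parse_content_type and file_extension are hypothetical helper names, and the mapping covers only the two MIME types documented above.

```python
from __future__ import annotations


def parse_content_type(content_type: str) -> tuple[str, dict[str, str]]:
    """Split a MIME string into (media_type, params).

    Hypothetical helper, not part of the Speko SDK.
    """
    media_type, *raw_params = content_type.split(";")
    params: dict[str, str] = {}
    for raw in raw_params:
        key, sep, value = raw.partition("=")
        if sep:  # skip malformed parameters that have no "="
            params[key.strip().lower()] = value.strip()
    return media_type.strip().lower(), params


def file_extension(content_type: str) -> str:
    """Map the two documented MIME types to a file extension."""
    media_type, _ = parse_content_type(content_type)
    if media_type == "audio/mpeg":
        return "mp3"
    if media_type == "audio/pcm":
        return "pcm"
    # Fail loudly rather than guess for any other provider format.
    raise ValueError(f"unhandled content type: {content_type!r}")
```

Raising on an unknown type, rather than defaulting to one extension, keeps a newly added provider format from being written to disk under the wrong name.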

Example — write to disk

from pathlib import Path

speech = speko.synthesize(
    "Welcome to the clinic.",
    language="en",
    vertical="healthcare",
    voice="sonic-english",
)

ext = "mp3" if "mpeg" in speech.content_type else "pcm"
Path(f"greeting.{ext}").write_bytes(speech.audio)
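
A raw .pcm file has no header, so most players cannot open it. One option is to wrap the bytes in a WAV container with Python's standard wave module. This sketch assumes the PCM is 16-bit signed little-endian mono, which the page above does not state; only the 24000 Hz rate comes from the documented content_type. pcm_to_wav is a hypothetical helper name.

```python
import io
import wave


def pcm_to_wav(
    pcm: bytes,
    sample_rate: int = 24000,  # from "audio/pcm;rate=24000"
    channels: int = 1,         # assumption: mono
    sample_width: int = 2,     # assumption: 16-bit samples (2 bytes)
) -> bytes:
    """Wrap headerless PCM bytes in a WAV container and return the result."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    # wave does not close a file object it did not open, so buf is still readable.
    return buf.getvalue()
```

In the example above, the PCM branch could then write pcm_to_wav(speech.audio) to greeting.wav instead of saving the raw bytes. Confirm the sample format with the provider before relying on these defaults.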

Example — pin a PCM provider

from spekoai import AllowedProviders, PipelineConstraints

speech = speko.synthesize(
    "Hello",
    language="en",
    vertical="general",
    constraints=PipelineConstraints(
        allowed_providers=AllowedProviders(tts=["cartesia"]),
    ),
)
# speech.content_type == "audio/pcm;rate=24000"

Downstream consumers that only handle PCM (e.g. older LiveKit pipelines) should either pin a PCM provider via constraints or branch on content_type before decoding. Otherwise, MP3 from ElevenLabs can reach a decoder that expects PCM.
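
For consumers that do not pin a provider, a guard placed in front of the decoder is one way to do that branching. The sketch below is illustrative only: require_pcm is a hypothetical helper, and it assumes the sample rate is always carried in a rate parameter, as in the Cartesia content type shown above.

```python
def require_pcm(audio: bytes, content_type: str) -> tuple[bytes, int]:
    """Return (audio, sample_rate) if the payload is PCM, else raise ValueError.

    Hypothetical guard, not part of the Speko SDK.
    """
    media_type, _, raw_params = content_type.partition(";")
    if media_type.strip().lower() != "audio/pcm":
        raise ValueError(f"expected audio/pcm, got {content_type!r}")

    # Look for the "rate" parameter, e.g. "rate=24000".
    for param in raw_params.split(";"):
        key, _, value = param.partition("=")
        if key.strip().lower() == "rate":
            return audio, int(value)

    raise ValueError(f"no sample rate in {content_type!r}")
```

Calling require_pcm(speech.audio, speech.content_type) before handing bytes to a PCM-only pipeline turns a silent decode failure into an explicit error.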