Speko Docs

synthesize

POST /v1/synthesize — text-to-speech with automatic provider routing.

Synthesize text into audio. The router picks the best TTS provider for your (language, optimize_for) pair and fails over to another provider automatically if the first choice fails.

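from pathlib import Path

# Assumes `speko` is an already-constructed Speko client.
speech = speko.synthesize(
    "Hello world",
    language="en",
)
# The format depends on the provider; check speech.content_type before assuming MP3.
Path("out.mp3").write_bytes(speech.audio)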

Signature

Speko.synthesize(
    text: str,
    *,
    language: str,
    optimize_for: OptimizeFor | None = None,
    voice: str | None = None,
    speed: float | None = None,
    constraints: PipelineConstraints | dict | None = None,
) -> SynthesizeResult

await AsyncSpeko.synthesize(
    text: str,
    *,
    language: str,
    optimize_for: OptimizeFor | None = None,
    voice: str | None = None,
    speed: float | None = None,
    constraints: PipelineConstraints | dict | None = None,
) -> SynthesizeResult
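
Since the async variant takes the same arguments as the sync one, several requests can be issued concurrently. The following is a minimal sketch, not part of the SDK: `synthesize_many` is a hypothetical helper, and `client` is assumed to be an already-constructed `AsyncSpeko` instance (construction is not covered on this page).

```python
import asyncio


async def synthesize_many(client, texts, language):
    """Synthesize several texts concurrently; results keep the input order.

    ASSUMPTION: `client` exposes `await client.synthesize(text, language=...)`
    as in the AsyncSpeko signature. This helper is illustrative only.
    """
    tasks = [client.synthesize(text, language=language) for text in texts]
    # gather() returns results in the same order as the awaitables it was given.
    return await asyncio.gather(*tasks)
```

By default, `asyncio.gather` propagates the first exception raised by any request; pass `return_exceptions=True` if you would rather inspect failures per item.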

Parameters

text (string, required)

The text to synthesize. No client-side length limit — the upstream provider applies its own.

language (string, required)

BCP-47 language tag, for example "en".

optimize_for ('balanced' | 'accuracy' | 'latency' | 'cost')

Routing objective the router weighs when picking a provider. Optional.

voice (string)

Voice id override. Interpreted per provider (e.g. a Cartesia voice UUID, an ElevenLabs voice id). Omit to use each provider's default.

speed (float)

Speech speed multiplier. Providers vary in accepted ranges — 1.0 is always neutral.

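constraints (PipelineConstraints | dict)

Allow-list constraints that restrict which providers the router may pick, for example allowed_providers with a tts list.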
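
Because `text` has no client-side length limit and each upstream provider enforces its own, long inputs may need to be split before calling `synthesize`. The sketch below shows one way to do that at sentence boundaries. `split_text` and `max_chars` are hypothetical; the page does not state any provider's actual limit.

```python
import re


def split_text(text, max_chars):
    """Split text into chunks of roughly max_chars, breaking after sentences.

    ASSUMPTION: max_chars is a caller-chosen limit; check your provider's
    documentation for the real value. A single sentence longer than
    max_chars is kept whole rather than cut mid-sentence.
    """
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed to `synthesize` separately. How the resulting audio pieces are joined depends on the returned `content_type`.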

Returns — SynthesizeResult

audio (bytes)

Raw audio bytes. Format depends on the chosen provider — always branch on content_type.

content_type (string)

MIME type. ElevenLabs returns audio/mpeg. Cartesia returns audio/pcm;rate=24000.

provider (string)

model (string)

failover_count (int)

Number of providers tried before this one succeeded.

scores_run_id (string | None)
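
To branch on `content_type` reliably, it helps to separate the media type from its parameters, since a value such as `audio/pcm;rate=24000` carries the sample rate after the semicolon. A minimal sketch follows; `parse_content_type` is a hypothetical helper, not an SDK function.

```python
def parse_content_type(content_type):
    """Split a MIME type such as 'audio/pcm;rate=24000' into (type, params)."""
    media_type, *raw_params = content_type.split(";")
    params = {}
    for item in raw_params:
        key, sep, value = item.partition("=")
        if sep:  # ignore malformed parameters that have no '='
            params[key.strip().lower()] = value.strip().strip('"')
    return media_type.strip().lower(), params
```

With this, a caller can compare the first element against `"audio/mpeg"` or `"audio/pcm"` and read `params.get("rate")` when the audio is PCM.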

Example — write to disk

from pathlib import Path

speech = speko.synthesize(
    "Welcome to the clinic.",
    language="en",
    voice="sonic-english",
)

ext = "mp3" if "mpeg" in speech.content_type else "pcm"
Path(f"greeting.{ext}").write_bytes(speech.audio)
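
A raw `.pcm` file has no header, so most players cannot open it directly. One option is to wrap the bytes in a WAV container with the standard-library `wave` module. This is a sketch under stated assumptions: the page only gives the sample rate (24000), so signed 16-bit little-endian mono samples are assumed here and should be confirmed for your provider. `pcm_to_wav` is a hypothetical helper.

```python
import io
import wave


def pcm_to_wav(pcm, sample_rate, channels=1, sample_width=2):
    """Wrap raw PCM bytes in a WAV container and return the WAV bytes.

    ASSUMPTION: 16-bit (sample_width=2), mono, little-endian samples.
    Only the sample rate is documented, so verify the other values.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    # The wave module leaves a caller-supplied buffer open after closing.
    return buffer.getvalue()
```

For PCM output, `Path("greeting.wav").write_bytes(pcm_to_wav(speech.audio, 24000))` would then produce a file that ordinary audio players can open, provided the assumed sample format is correct.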

Example — pin a PCM provider

from spekoai import AllowedProviders, PipelineConstraints

speech = speko.synthesize(
    "Hello",
    language="en",
    constraints=PipelineConstraints(
        allowed_providers=AllowedProviders(tts=["cartesia"]),
    ),
)
# speech.content_type == "audio/pcm;rate=24000"

Downstream consumers that only handle PCM (e.g. older LiveKit pipelines) should pin a PCM provider via constraints — or branch on content_type before decoding. MP3 from ElevenLabs will otherwise hit your decoder unexpectedly.
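
For the second option, branching on `content_type` before decoding, a small guard can reject non-PCM audio early instead of letting it reach the decoder. The sketch below is illustrative only: `require_pcm` is a hypothetical helper, and it assumes the `audio/pcm;rate=...` form shown on this page.

```python
def require_pcm(content_type):
    """Return the PCM sample rate in Hz, or raise if the audio is not PCM.

    ASSUMPTION: PCM responses look like 'audio/pcm;rate=24000'.
    Anything else raises ValueError.
    """
    media_type, _, params = content_type.partition(";")
    if media_type.strip().lower() != "audio/pcm":
        raise ValueError(f"expected audio/pcm, got {content_type!r}")
    for item in params.split(";"):
        key, _, value = item.partition("=")
        if key.strip().lower() == "rate":
            return int(value)
    raise ValueError(f"no rate parameter in {content_type!r}")
```

Calling `require_pcm(speech.content_type)` right after `synthesize` turns an unexpected MP3 into an explicit error at the call site.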
