Speko Docs

transcribe

POST /v1/transcribe — speech-to-text with automatic provider routing.

Transcribe an audio payload. The router picks the best STT provider for your (language, optimize_for) pair and fails over to another provider automatically.

result = speko.transcribe(
    audio_bytes,
    language="es-MX",
)
print(result.text, result.provider, result.confidence)

Signature

Speko.transcribe(
    audio: bytes,
    *,
    language: str,
    optimize_for: OptimizeFor | None = None,
    content_type: str = "audio/wav",
    constraints: PipelineConstraints | dict | None = None,
) -> TranscribeResult

await AsyncSpeko.transcribe(
    audio: bytes,
    *,
    language: str,
    optimize_for: OptimizeFor | None = None,
    content_type: str = "audio/wav",
    constraints: PipelineConstraints | dict | None = None,
) -> TranscribeResult

Parameters

audio (bytes, required)

Raw audio bytes. Providers handle resampling and format conversion — any sample rate works. Wrap bytearray / memoryview with bytes(...) at the call site.
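
The bytes(...) wrapping described above can live in a small helper at the call site. This is a minimal sketch; as_audio_bytes is a hypothetical name and not part of the SDK.

```python
from typing import Union


def as_audio_bytes(buf: Union[bytes, bytearray, memoryview]) -> bytes:
    """Return an immutable bytes object suitable for the `audio` argument."""
    # bytes passes through unchanged; bytearray and memoryview are copied.
    return buf if isinstance(buf, bytes) else bytes(buf)


# Example: slice a mutable buffer through a memoryview, then normalise it.
chunk = bytearray(b"RIFF....WAVE")
audio = as_audio_bytes(memoryview(chunk)[:4])
```

The copy is made once, just before the call, so upstream code can keep working with mutable buffers.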

language (string, required)

BCP-47 language tag, e.g. "en", "es-MX", "ja-JP".
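
A cheap client-side shape check can catch obviously malformed tags before a request is sent. The sketch below is deliberately simplified and is not a full BCP-47 validator; looks_like_language_tag is a hypothetical helper, and the service may accept or reject tags differently.

```python
import re

# Simplified shape: a 2-3 letter primary subtag, then optional
# hyphen-separated alphanumeric subtags. Not a complete BCP-47 grammar.
_TAG_SHAPE = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z0-9]{1,8})*")


def looks_like_language_tag(tag: str) -> bool:
    """Return True when `tag` has the rough shape of a BCP-47 tag."""
    return _TAG_SHAPE.fullmatch(tag) is not None
```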

optimize_for ('balanced' | 'accuracy' | 'latency' | 'cost')

Preset that biases the router's weighted provider score. When omitted, the server defaults to balanced.

content_type (string, default: audio/wav)

MIME type for the request body.
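
When audio comes from files in mixed formats, content_type can be derived from the file extension. The mapping below is a hypothetical sketch: which MIME types the service accepts is an assumption, so adjust the table to the formats you actually send.

```python
from pathlib import Path

# Hypothetical extension-to-MIME table (assumption, not an official list).
_MIME_BY_SUFFIX = {
    ".wav": "audio/wav",
    ".ogg": "audio/ogg",
    ".mp3": "audio/mpeg",
    ".flac": "audio/flac",
}


def guess_content_type(path: str, default: str = "audio/wav") -> str:
    """Map a file name to a MIME type, falling back to the documented default."""
    return _MIME_BY_SUFFIX.get(Path(path).suffix.lower(), default)
```

An explicit table is used rather than the standard mimetypes module because that module's results can vary with the host's MIME database.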

constraints (PipelineConstraints | dict)

Allow-list constraints. The router still ranks by score but only considers listed providers.

Returns — TranscribeResult

text (string)

Transcribed text.

provider (string)

Upstream provider that ran the request.

model (string)

Provider-specific model identifier.

confidence (float | None)

Model-reported confidence when available.

failover_count (int)

Number of providers tried before this one succeeded; 0 means the first-choice provider handled the request.

scores_run_id (string | None)

ID of the scoring run that selected this provider — useful for joining to benchmark data.
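
Because confidence can be None, downstream code should handle the missing case explicitly instead of comparing it directly. The sketch below uses a local stand-in dataclass that mirrors the fields listed above, so it runs without the SDK. ResultStandIn, needs_review, and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResultStandIn:
    """Local stand-in mirroring the documented TranscribeResult fields."""
    text: str
    provider: str
    model: str
    confidence: Optional[float]
    failover_count: int
    scores_run_id: Optional[str]


def needs_review(result, min_confidence: float = 0.8) -> bool:
    """Flag transcripts whose confidence is missing or below the threshold."""
    # A missing confidence is treated as unknown, so it is flagged as well.
    if result.confidence is None:
        return True
    return result.confidence < min_confidence


sample = ResultStandIn("hola", "deepgram", "example-model", None, 0, None)
flagged = needs_review(sample)
```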

Example — non-default MIME + allow-list

from pathlib import Path

from spekoai import AllowedProviders, PipelineConstraints, Speko

speko = Speko(api_key="sk_live_...")

audio = Path("call.ogg").read_bytes()
result = speko.transcribe(
    audio,
    language="en",
    optimize_for="accuracy",
    content_type="audio/ogg",
    constraints=PipelineConstraints(
        allowed_providers=AllowedProviders(stt=["deepgram", "assemblyai"]),
    ),
)

Wire format

The audio ships as the raw HTTP body. The routing intent and constraints travel in two headers so no server-side re-parsing of the body is needed:

  • Content-Type: value of content_type (default audio/wav).
  • X-Speko-Intent: compact JSON object with the key "language" and, optionally, "optimizeFor".
  • X-Speko-Constraints: compact JSON when constraints is set.

The response is a JSON TranscribeResult.
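
For clients that cannot use the SDK, the three headers described above can be assembled by hand. This is a sketch under stated assumptions: build_headers is a hypothetical helper, the JSON shape of the constraints payload is not specified on this page and is passed through as given, and authentication is omitted.

```python
import json
from typing import Optional


def _compact(obj: dict) -> str:
    # Compact JSON: no whitespace after separators.
    return json.dumps(obj, separators=(",", ":"))


def build_headers(
    language: str,
    optimize_for: Optional[str] = None,
    content_type: str = "audio/wav",
    constraints: Optional[dict] = None,
) -> dict:
    """Assemble the routing headers for a raw-body transcribe request."""
    intent = {"language": language}
    if optimize_for is not None:
        intent["optimizeFor"] = optimize_for  # camelCase key on the wire

    headers = {
        "Content-Type": content_type,
        "X-Speko-Intent": _compact(intent),
    }
    if constraints is not None:
        # Assumption: the constraints dict is already in the wire shape.
        headers["X-Speko-Constraints"] = _compact(constraints)
    return headers
```

The audio bytes would then be sent unmodified as the request body alongside these headers.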
