transcribe
POST /v1/transcribe — speech-to-text with automatic provider routing.
Transcribe an audio payload. The router picks the best STT provider for your (language, region, optimizeFor) and fails over automatically.
const { text, provider, confidence } = await speko.transcribe(audio, {
  language: 'es-MX',
});
Signature
speko.transcribe(
  audio: Uint8Array,
  options: TranscribeOptions,
  abortSignal?: AbortSignal,
): Promise<TranscribeResult>
speko.transcribeStream(
  audio: Uint8Array,
  options: TranscribeOptions,
  abortSignal?: AbortSignal,
): AsyncIterable<TranscribeStreamEvent>
Parameters
audio: Uint8Array
Raw audio bytes. Default MIME is audio/wav; set options.contentType if you send something else. Providers handle resampling and format conversion downstream — you don't have to match a specific sample rate.
options: TranscribeOptions
Extends RoutingIntent:
| Field | Type | Description |
|---|---|---|
| language | string (BCP-47) | e.g. "en", "es-MX", "ja-JP". |
| region | string? | Region to rank streaming providers in. Defaults to global server-side. |
| optimizeFor | 'balanced' \| 'accuracy' \| 'latency' \| 'cost'? | Bias the weighted score. Defaults to the server-side default (currently balanced). |
| contentType | string? | MIME type for the body. Defaults to audio/wav. |
| constraints | PipelineConstraints? | Allow-list constraints (see Types). |
| keywords | readonly string[]? | Domain words and proper nouns to bias STT output toward. |
abortSignal?: AbortSignal
Cancel an in-flight request. Composed with the client-level timeout.
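As a minimal sketch of cancellation, the helper below ties an AbortController to a timer and passes its signal as the third argument. `MinimalOptions`, `TranscribeFn`, and `transcribeWithDeadline` are hypothetical names introduced for illustration; only the three-argument call shape comes from the signature above. How the SDK surfaces an abort (for example, as a rejected promise) is not stated here, so the sketch simply propagates whatever the call does.

```typescript
// Hypothetical minimal option shape; the real TranscribeOptions has more fields.
interface MinimalOptions {
  language: string;
}

// Matches the documented call shape: (audio, options, abortSignal?).
type TranscribeFn<R> = (
  audio: Uint8Array,
  options: MinimalOptions,
  abortSignal?: AbortSignal,
) => Promise<R>;

// Hypothetical helper: abort the request if it runs longer than `ms`.
async function transcribeWithDeadline<R>(
  transcribe: TranscribeFn<R>,
  audio: Uint8Array,
  options: MinimalOptions,
  ms: number,
): Promise<R> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await transcribe(audio, options, controller.signal);
  } finally {
    // Always clear the timer so it cannot fire after the call settles.
    clearTimeout(timer);
  }
}

// Possible usage, assuming a configured `speko` client is in scope:
//   await transcribeWithDeadline(
//     (a, o, s) => speko.transcribe(a, o, s), audio, { language: 'en' }, 5000);
```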
Returns
TranscribeResult
| Field | Type | Description |
|---|---|---|
| text | string | Transcribed text. |
| provider | string | Upstream provider that ran the request (e.g. deepgram, openai). |
| model | string | Provider-specific model identifier. |
| confidence | number \| null | Model-reported confidence when available, else null. |
| failoverCount | number | How many providers were tried before this one succeeded. |
| scoresRunId | string \| null | ID of the scoring run that selected this provider; useful for joining to benchmark data. |
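Because confidence can be null, callers need to handle both cases. The sketch below shows one way to do that; `needsReview`, `describeResult`, and the 0.8 threshold are hypothetical, and the 0 to 1 confidence scale is an assumption that this reference does not confirm.

```typescript
// Mirrors the fields in the table above.
interface TranscribeResult {
  text: string;
  provider: string;
  model: string;
  confidence: number | null;
  failoverCount: number;
  scoresRunId: string | null;
}

// Hypothetical policy: flag a transcript for human review when the provider
// reported no confidence, or reported one below `threshold`.
// Assumes confidence is on a 0..1 scale, which the reference does not state.
function needsReview(result: TranscribeResult, threshold = 0.8): boolean {
  return result.confidence === null || result.confidence < threshold;
}

// One-line summary suitable for logs.
function describeResult(result: TranscribeResult): string {
  const conf = result.confidence === null ? 'n/a' : result.confidence.toFixed(2);
  return `${result.provider}/${result.model} confidence=${conf} failovers=${result.failoverCount}`;
}
```

A non-zero failoverCount means at least one provider was tried before the one that succeeded, which can be worth logging alongside scoresRunId.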
Wire format
The SDK sends the audio as the raw HTTP body with:
- Content-Type: from options.contentType (default audio/wav).
- X-Speko-Intent: JSON-serialized { language, region?, optimizeFor? }.
- X-Speko-Constraints: JSON-serialized options.constraints (only if set).
- X-Speko-Stt-Options: JSON-serialized { keywords } (only if keywords are set).
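For readers calling the endpoint without the SDK, the header rules above could be assembled roughly as follows. This is a sketch, not the SDK's actual code: `HeaderOptions` and `buildTranscribeHeaders` are hypothetical names, and treating an empty keywords array as "not set" is an assumption.

```typescript
// Subset of TranscribeOptions that affects the request headers.
interface HeaderOptions {
  language: string;
  region?: string;
  optimizeFor?: 'balanced' | 'accuracy' | 'latency' | 'cost';
  contentType?: string;
  constraints?: unknown; // PipelineConstraints; its shape is defined in Types
  keywords?: readonly string[];
}

// Hypothetical helper that derives the four documented headers.
function buildTranscribeHeaders(options: HeaderOptions): Record<string, string> {
  const headers: Record<string, string> = {
    'Content-Type': options.contentType ?? 'audio/wav',
    // JSON.stringify drops keys whose value is undefined, so an unset
    // region or optimizeFor is omitted from the serialized intent.
    'X-Speko-Intent': JSON.stringify({
      language: options.language,
      region: options.region,
      optimizeFor: options.optimizeFor,
    }),
  };
  if (options.constraints !== undefined) {
    headers['X-Speko-Constraints'] = JSON.stringify(options.constraints);
  }
  // Assumption: an empty keywords array counts as "not set".
  if (options.keywords !== undefined && options.keywords.length > 0) {
    headers['X-Speko-Stt-Options'] = JSON.stringify({ keywords: options.keywords });
  }
  return headers;
}
```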
The wire response is text/event-stream with meta, transcript, done,
and error events. speko.transcribe() consumes that stream and returns the
final TranscribeResult; use speko.transcribeStream() to receive partial
transcripts directly.
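The reduction from stream to final result can be pictured with a small consumer loop. The reference names the four event kinds but not their payloads, so every field in `StreamEvent` below is an assumption, and `collectFinalText` is a hypothetical helper rather than what speko.transcribe() does internally.

```typescript
// Hypothetical event shapes: only the four `type` values come from the
// reference; the payload fields are assumptions made for this sketch.
type StreamEvent =
  | { type: 'meta'; provider: string }
  | { type: 'transcript'; text: string; isFinal: boolean }
  | { type: 'done' }
  | { type: 'error'; message: string };

// Collect finalized transcript segments until `done`; throw on `error`.
async function collectFinalText(events: AsyncIterable<StreamEvent>): Promise<string> {
  const finals: string[] = [];
  for await (const event of events) {
    switch (event.type) {
      case 'transcript':
        // Partial transcripts are skipped; only finalized text is kept.
        if (event.isFinal) finals.push(event.text);
        break;
      case 'error':
        throw new Error(event.message);
      case 'done':
        return finals.join(' ');
      case 'meta':
        break; // routing metadata; ignored in this sketch
    }
  }
  // The stream ended without an explicit `done`; return what was collected.
  return finals.join(' ');
}
```

With the real SDK, the iterable would come from speko.transcribeStream(audio, options), and the event fields should be checked against the TranscribeStreamEvent type before relying on this shape.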
Example: non-default MIME
import { readFile } from 'node:fs/promises';

const audio = await readFile('./call.ogg');
const result = await speko.transcribe(audio, {
  language: 'en',
  contentType: 'audio/ogg',
  optimizeFor: 'accuracy',
});
Example: restrict provider pool
const result = await speko.transcribe(audio, {
  language: 'en',
  constraints: {
    allowedProviders: { stt: ['deepgram', 'assemblyai'] },
  },
});
The router still ranks candidates by benchmark score and only picks from the allow-list.
Example: bias proper nouns
const result = await speko.transcribe(audio, {
  language: 'en',
  keywords: ['Speko', 'Ava Martinez', 'Cartesia'],
});