transcribe

Transcribe an audio payload. The router picks the best STT provider for your routing intent (language, region, optimizeFor) and fails over to the next candidate automatically if a provider fails.

const { text, provider, confidence } = await speko.transcribe(audio, {
  language: 'es-MX',
});

Signature

speko.transcribe(
  audio: Uint8Array,
  options: TranscribeOptions,
  abortSignal?: AbortSignal,
): Promise<TranscribeResult>

speko.transcribeStream(
  audio: Uint8Array,
  options: TranscribeOptions,
  abortSignal?: AbortSignal,
): AsyncIterable<TranscribeStreamEvent>

`audio: Uint8Array`

Raw audio bytes. The default MIME type is audio/wav; set options.contentType if you send anything else. Providers handle resampling and format conversion downstream, so you don't have to match a specific sample rate.

`options: TranscribeOptions`

Extends RoutingIntent:

Field	Type	Description
`language`	`string` (BCP-47)	e.g. `"en"`, `"es-MX"`, `"ja-JP"`.
`region`	`string?`	Region to rank streaming providers in. Defaults to `global` server-side.
`optimizeFor`	`'balanced' \| 'accuracy' \| 'latency' \| 'cost'?`	Bias the weighted score. Defaults to the server-side default (currently `balanced`).
`contentType`	`string?`	MIME type for the body. Defaults to `audio/wav`.
`constraints`	`PipelineConstraints?`	Allow-list constraints (see Types).
`keywords`	`readonly string[]?`	Domain words and proper nouns to bias STT output toward.

`abortSignal?: AbortSignal`

Cancel an in-flight request. Composed with the client-level timeout.
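
As a rough sketch of how the abort signal might be used to enforce a per-call deadline: the `TranscribeClient` interface and the `transcribeWithDeadline` helper below are hypothetical names introduced for illustration, and the sketch assumes that `transcribe()` rejects once its signal is aborted (the exact error type is not specified here).

```typescript
// Hypothetical structural type covering only the call used in this sketch.
interface TranscribeClient {
  transcribe(
    audio: Uint8Array,
    options: { language: string },
    abortSignal?: AbortSignal,
  ): Promise<{ text: string }>;
}

// Returns the transcript, or null if our own deadline cancelled the request.
// ASSUMPTION: the SDK rejects the promise when the signal aborts.
async function transcribeWithDeadline(
  client: TranscribeClient,
  audio: Uint8Array,
  language: string,
  deadlineMs: number,
): Promise<string | null> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), deadlineMs);
  try {
    const { text } = await client.transcribe(audio, { language }, controller.signal);
    return text;
  } catch (err) {
    // Treat the failure as a cancellation only if our signal has fired.
    if (controller.signal.aborted) return null;
    throw err;
  } finally {
    clearTimeout(timer); // do not leave the timer pending after the call settles
  }
}
```

Passing `speko` as the `client` argument should type-check as long as the real method accepts these arguments; any failure that occurs before the deadline fires is rethrown unchanged.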

Returns

`TranscribeResult`

Field	Type	Description
`text`	`string`	Transcribed text.
`provider`	`string`	Upstream provider that ran the request (e.g. `deepgram`, `openai`).
`model`	`string`	Provider-specific model identifier.
`confidence`	`number \| null`	Model-reported confidence when available, else `null`.
`failoverCount`	`number`	How many providers were tried before this one succeeded.
`scoresRunId`	`string \| null`	ID of the scoring run that selected this provider — useful for joining to benchmark data.
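
Because `confidence` can be `null` and `failoverCount` can be non-zero, callers may want to triage results before trusting them. The following is a minimal sketch only: `needsReview` and the `0.8` threshold are hypothetical, and it assumes that confidence is reported on a 0 to 1 scale, which is not stated above.

```typescript
// Local subset of the result fields used by this sketch.
interface ResultLike {
  confidence: number | null;
  failoverCount: number;
}

// Flags a result for manual review.
// ASSUMPTION: confidence is in the range [0, 1]; the threshold is arbitrary.
function needsReview(result: ResultLike, minConfidence = 0.8): boolean {
  // No confidence reported: we cannot judge quality, so be conservative.
  if (result.confidence === null) return true;
  // Low confidence, or at least one provider failed before this one.
  return result.confidence < minConfidence || result.failoverCount > 0;
}
```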

Wire format

The SDK sends the audio as the raw HTTP body with:

Content-Type: from options.contentType (default audio/wav).
X-Speko-Intent: JSON-serialized { language, region?, optimizeFor? }.
X-Speko-Constraints: JSON-serialized options.constraints (only if set).
X-Speko-Stt-Options: JSON-serialized { keywords } (only if keywords are set).
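
To make the header mapping concrete, here is an illustrative sketch of how those headers could be derived from the options. It is not the SDK's actual code: `WireOptions` and `buildTranscribeHeaders` are hypothetical names, and treating an empty `keywords` array as "not set" is an assumption.

```typescript
// Hypothetical local mirror of the option fields that affect the headers.
interface WireOptions {
  language: string;
  region?: string;
  optimizeFor?: 'balanced' | 'accuracy' | 'latency' | 'cost';
  contentType?: string;
  constraints?: unknown;
  keywords?: readonly string[];
}

// Builds the request headers described above. Illustrative only.
function buildTranscribeHeaders(options: WireOptions): Record<string, string> {
  // The intent always carries language; region and optimizeFor are optional.
  const intent: Record<string, string> = { language: options.language };
  if (options.region !== undefined) intent.region = options.region;
  if (options.optimizeFor !== undefined) intent.optimizeFor = options.optimizeFor;

  const headers: Record<string, string> = {
    'Content-Type': options.contentType ?? 'audio/wav',
    'X-Speko-Intent': JSON.stringify(intent),
  };
  if (options.constraints !== undefined) {
    headers['X-Speko-Constraints'] = JSON.stringify(options.constraints);
  }
  // ASSUMPTION: an empty keywords array counts as "not set".
  if (options.keywords !== undefined && options.keywords.length > 0) {
    headers['X-Speko-Stt-Options'] = JSON.stringify({ keywords: options.keywords });
  }
  return headers;
}
```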

The wire response is text/event-stream with meta, transcript, done, and error events. speko.transcribe() consumes that stream and returns the final TranscribeResult; use speko.transcribeStream() to receive partial transcripts directly.
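
The sketch below shows one way a caller might reduce such an event stream to a final transcript. The event kinds (meta, transcript, done, error) come from the description above, but the `StreamEvent` field names are assumptions, as is the idea that each transcript event carries the full text so far rather than a delta; check the `TranscribeStreamEvent` type before relying on either.

```typescript
// ASSUMPTION: hypothetical event shape. Only the four event kinds are
// documented above; the field names here are illustrative.
type StreamEvent =
  | { type: 'meta'; provider: string }
  | { type: 'transcript'; text: string }
  | { type: 'done' }
  | { type: 'error'; message: string };

// Consumes events until `done` and returns the most recent transcript text.
// ASSUMPTION: each transcript event replaces the previous one (not a delta).
async function lastTranscript(events: AsyncIterable<StreamEvent>): Promise<string> {
  let latest = '';
  for await (const event of events) {
    switch (event.type) {
      case 'transcript':
        latest = event.text;
        break;
      case 'error':
        throw new Error(event.message);
      case 'done':
        return latest;
      default:
        break; // meta events carry no transcript text
    }
  }
  return latest; // the stream ended without an explicit done event
}
```

If the real event type matches this shape, the result of `speko.transcribeStream(audio, options)` could be passed to the helper directly.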

Example: non-default MIME

import { readFile } from 'node:fs/promises';

const audio = await readFile('./call.ogg');
const result = await speko.transcribe(audio, {
  language: 'en',
  contentType: 'audio/ogg',
  optimizeFor: 'accuracy',
});

Example: restrict provider pool

const result = await speko.transcribe(audio, {
  language: 'en',
  constraints: {
    allowedProviders: { stt: ['deepgram', 'assemblyai'] },
  },
});

The router still ranks candidates by benchmark score, but it picks only from providers on the allow-list.

Example: bias proper nouns

const result = await speko.transcribe(audio, {
  language: 'en',
  keywords: ['Speko', 'Ava Martinez', 'Cartesia'],
});