SpekoSTT
LiveKit Agents STT adapter backed by POST /v1/transcribe.
SpekoSTT is an stt.STT implementation. It encodes each utterance's audio frames into a WAV payload and uploads it to the Speko proxy. The router picks the best STT provider for your (language, region, optimizeFor) and handles failover.
import { SpekoSTT } from '@spekoai/adapter-livekit';
import { stt as sttNs } from '@livekit/agents';
const spekoSTT = new SpekoSTT({
  speko,
  intent: { language: 'en-US' },
});
const wrapped = new sttNs.StreamAdapter(spekoSTT, vad);
Constructor
new SpekoSTT(options: SpekoSTTOptions)
SpekoSTTOptions
| Field | Type | Required | Description |
|---|---|---|---|
| speko | Speko | ✅ | @spekoai/sdk client. |
| intent | Intent | ✅ | Validated at construction time. |
| constraints | PipelineConstraints? | | Allow-list constraints passed on every call. |
The constructor calls validateIntent(intent), so a broken routing hint throws here rather than deep inside the first transcription.
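The fail-fast behavior described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `SketchIntent`, `checkIntent`, and `FailFastSTT` are hypothetical names, and the language-tag pattern is a stand-in for whatever rules the real validateIntent applies.

```typescript
// Hypothetical intent shape, loosely based on the (language, region, optimizeFor)
// tuple mentioned above. The real Intent type may differ.
interface SketchIntent {
  language: string;
  region?: string;
  optimizeFor?: string;
}

// Hypothetical validator: accepts tags shaped like "en" or "en-US".
// This is NOT the SDK's validateIntent, only a stand-in.
function checkIntent(intent: SketchIntent): void {
  const languageTag = /^[a-z]{2,3}(-[A-Za-z0-9]{2,8})*$/;
  if (!languageTag.test(intent.language)) {
    throw new Error(`invalid language tag: ${intent.language}`);
  }
}

// Validating in the constructor surfaces a bad hint immediately,
// before any audio is captured or uploaded.
class FailFastSTT {
  readonly intent: SketchIntent;

  constructor(intent: SketchIntent) {
    checkIntent(intent);
    this.intent = intent;
  }
}

try {
  new FailFastSTT({ language: 'english!!' });
} catch (err) {
  // The error appears at construction, not during the first utterance.
  console.log((err as Error).message);
}
```

The design choice is that a misconfigured adapter never reaches a live session, so the failure shows up at startup where it is easy to trace.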
Properties
- label = 'speko.STT'
- provider = 'speko'
- model = 'speko-router'
- streaming = false, interimResults = false
Streaming requirement
SpekoSTT.stream() throws because this adapter uploads one VAD-segmented WAV per
utterance. The /v1/transcribe response itself streams transcript events, and
speko.transcribe() aggregates the final result for this class. Wrap the
instance:
import { stt } from '@livekit/agents';
const adapter = new stt.StreamAdapter(spekoSTT, vad);
Or use createSpekoComponents which does this for you.
Per-utterance flow
- StreamAdapter + VAD segment the user's audio into utterances.
- SpekoSTT._recognize(frame, abortSignal) is invoked for each utterance.
- Frames are combined (combineAudioFrames) and encoded into PCM16 mono WAV via framesToWav.
- The WAV is uploaded via speko.transcribe() with the intent header and any constraints.
- The result is emitted as a single FINAL_TRANSCRIPT event with confidence defaulting to 1 when the upstream provider doesn't report one.
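The combine, encode, and default-confidence steps above can be sketched in a self-contained way. This is not the adapter's code: `combineFrames`, `encodePcm16MonoWav`, and `confidenceOrDefault` are hypothetical helpers, frames are modeled as plain `Int16Array` buffers rather than LiveKit audio frames, and the 16 kHz sample rate is an arbitrary choice. The header layout follows the standard 44-byte PCM WAV format.

```typescript
// Concatenate per-frame sample buffers into one contiguous utterance.
function combineFrames(frames: Int16Array[]): Int16Array {
  const total = frames.reduce((n, f) => n + f.length, 0);
  const out = new Int16Array(total);
  let offset = 0;
  for (const f of frames) {
    out.set(f, offset);
    offset += f.length;
  }
  return out;
}

// Wrap PCM16 mono samples in a canonical 44-byte RIFF/WAVE header.
function encodePcm16MonoWav(samples: Int16Array, sampleRate: number): Uint8Array {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true); // total file size minus 8 bytes
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  view.setUint32(16, 16, true); // fmt chunk length for PCM
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, 1, true); // channel count: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, 'data');
  view.setUint32(40, dataSize, true);
  samples.forEach((s, i) => view.setInt16(44 + i * bytesPerSample, s, true));
  return new Uint8Array(buffer);
}

// Fall back to 1 when the provider reports no confidence.
function confidenceOrDefault(reported?: number): number {
  return reported ?? 1;
}

const utterance = combineFrames([new Int16Array([0, 1000]), new Int16Array([-1000])]);
const wav = encodePcm16MonoWav(utterance, 16000);
console.log(wav.byteLength, confidenceOrDefault(undefined));
```

Using `??` rather than `||` matters here: a provider that legitimately reports a confidence of `0` keeps that value instead of being overwritten with the default.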
Aborts propagate: when the session tears down, the AbortSignal passed by StreamAdapter cancels the in-flight HTTP request.
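As a rough illustration of abort propagation, the sketch below cancels a simulated upload through an `AbortSignal`. The `startUpload` function and `UploadHandle` type are hypothetical stand-ins, with a timer in place of the real HTTP request; they are not part of the Speko SDK or LiveKit Agents.

```typescript
// Hypothetical handle for an in-flight upload.
interface UploadHandle {
  done: Promise<string>;
  isCancelled: () => boolean;
}

// Simulated upload: resolves after a delay unless the signal aborts first.
function startUpload(wavBytes: Uint8Array, signal: AbortSignal): UploadHandle {
  let cancelled = false;
  const done = new Promise<string>((resolve, reject) => {
    if (signal.aborted) {
      cancelled = true;
      reject(new Error('aborted before start'));
      return;
    }
    // Timer stands in for network latency.
    const timer = setTimeout(() => resolve(`uploaded ${wavBytes.byteLength} bytes`), 50);
    signal.addEventListener(
      'abort',
      () => {
        clearTimeout(timer); // stop the pending work
        cancelled = true;
        reject(new Error('upload aborted'));
      },
      { once: true },
    );
  });
  return { done, isCancelled: () => cancelled };
}

// The session owns the controller; the upload only sees the signal.
const controller = new AbortController();
const upload = startUpload(new Uint8Array(44), controller.signal);
upload.done.catch((err: Error) => console.log(err.message));

// Session teardown: aborting cancels the in-flight upload.
controller.abort();
```

With a real `fetch` call, the same signal would be passed as the `signal` option in the request init, which lets the runtime cancel the request itself.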
Mono-only
Multi-channel audio throws at the WAV-encode step:
SpekoSTT: expected mono audio (1 channel), got 2. …
Configure your LiveKit AgentSession to pass mono audio, or pre-mix upstream.
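For the pre-mix option, one possible approach is to average interleaved stereo samples into a single channel before the audio reaches the adapter. The sketch below assumes interleaved PCM16 input (L, R, L, R, ...); `assertMono` and `downmixStereoToMono` are hypothetical helpers, and the guard's message only loosely mirrors the error shown above.

```typescript
// Guard similar in spirit to the adapter's mono check (hypothetical).
function assertMono(channels: number): void {
  if (channels !== 1) {
    throw new Error(`expected mono audio (1 channel), got ${channels}`);
  }
}

// Average each left/right pair of interleaved PCM16 samples into one sample.
function downmixStereoToMono(interleaved: Int16Array): Int16Array {
  const mono = new Int16Array(Math.floor(interleaved.length / 2));
  for (let i = 0; i < mono.length; i++) {
    const left = interleaved[2 * i] ?? 0;
    const right = interleaved[2 * i + 1] ?? 0;
    // The mean of two int16 values always fits in int16, so no clipping is needed.
    mono[i] = Math.round((left + right) / 2);
  }
  return mono;
}

const stereo = new Int16Array([1000, 3000, -4, -6]);
const mono = downmixStereoToMono(stereo);
assertMono(1); // passes once the audio has been pre-mixed
console.log(Array.from(mono));
```

Averaging is the simplest downmix; it halves the level of content present in only one channel, which is usually acceptable for speech.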