@spekoai/adapter-livekit
LiveKit Agents adapter — route STT, LLM, and TTS through Speko.
@spekoai/adapter-livekit bridges a LiveKit Agents worker to the Speko proxy. Drop it into a standard agent entry file and the router picks the best STT, LLM, and TTS provider per call. Failover is server-side; you don't ship provider API keys.
Install
npm install @spekoai/sdk @spekoai/adapter-livekit \
  @livekit/agents @livekit/agents-plugin-silero @livekit/rtc-node
@livekit/agents and @livekit/rtc-node are peer dependencies. Pin the versions you actually run against in your own package.json.
Quickstart
import {
  type JobContext,
  type JobProcess,
  ServerOptions,
  cli,
  defineAgent,
  voice,
} from '@livekit/agents';
import * as silero from '@livekit/agents-plugin-silero';
import { Speko } from '@spekoai/sdk';
import { createSpekoComponents } from '@spekoai/adapter-livekit';
import { fileURLToPath } from 'node:url';

const speko = new Speko({ apiKey: process.env.SPEKO_API_KEY! });

export default defineAgent({
  prewarm: async (proc: JobProcess) => {
    proc.userData.vad = await silero.VAD.load();
  },
  entry: async (ctx: JobContext) => {
    const vad = ctx.proc.userData.vad as silero.VAD;
    const { stt, llm, tts } = createSpekoComponents({
      speko,
      vad,
      intent: { language: 'en-US', optimizeFor: 'balanced' },
    });
    const session = new voice.AgentSession({ vad, stt, llm, tts });
    await session.start({
      agent: new voice.Agent({
        instructions: 'You are a helpful voice assistant. Be concise.',
      }),
      room: ctx.room,
    });
    await ctx.connect();
    session.generateReply({ instructions: 'Greet the user and offer your assistance.' });
  },
});

cli.runApp(
  new ServerOptions({
    agent: fileURLToPath(import.meta.url),
    agentName: 'speko-demo',
  }),
);
Architecture
The adapter exports three @livekit/agents-compatible classes — SpekoSTT, SpekoLLM, SpekoTTS — and a convenience factory createSpekoComponents() that wraps STT and TTS with StreamAdapter helpers so Speko's streaming REST proxy can participate in a streaming voice.AgentSession:
- SpekoSTT declares { streaming: false }, so it must be wrapped with new stt.StreamAdapter(spekoSTT, vad) to segment utterances with VAD before calling /v1/transcribe.
- SpekoTTS is sentence-bounded in LiveKit, so it is wrapped with new tts.StreamAdapter(spekoTTS, sentenceTokenizer) before each streaming /v1/synthesize call.
- SpekoLLM is used directly. It's an llm.LLM backed by streaming /v1/complete responses.
createSpekoComponents handles the wrapping for you and returns { stt, llm, tts } ready to pass to voice.AgentSession.
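To make "sentence-bounded" concrete, here is a minimal, self-contained sketch of the pattern. It is not the adapter's code. splitSentences is a naive stand-in for LiveKit's sentence tokenizer, and the synthesize callback is a hypothetical placeholder for one /v1/synthesize request.

```typescript
// Hypothetical stand-in for one /v1/synthesize request (assumption, not the real API).
type Synthesize = (sentence: string) => Promise<Uint8Array>;

// Naive splitter: a sentence is a run of text ending in '.', '!' or '?',
// plus any trailing text that has no terminator. Real tokenizers handle
// abbreviations and decimals; this one does not.
function splitSentences(text: string): string[] {
  const matches = text.match(/[^.!?]+[.!?]+|[^.!?]+$/g);
  const sentences: string[] = [];
  if (matches === null) {
    return sentences;
  }
  for (const match of matches) {
    const trimmed = match.trim();
    if (trimmed.length > 0) {
      sentences.push(trimmed);
    }
  }
  return sentences;
}

// Issues exactly one synthesize call per sentence, in order, and collects
// the audio chunk returned for each.
async function synthesizeBySentence(
  text: string,
  synthesize: Synthesize,
): Promise<Uint8Array[]> {
  const chunks: Uint8Array[] = [];
  for (const sentence of splitSentences(text)) {
    chunks.push(await synthesize(sentence));
  }
  return chunks;
}
```

In this sketch, a three-sentence reply results in three sequential synthesize calls, so playback of the first sentence can begin before the later ones are requested.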
v1 limitations
- STT request upload is utterance-bounded. /v1/transcribe streams transcript events back, but this adapter still uploads one VAD-segmented WAV per utterance instead of full-duplex microphone audio.
- TTS remains sentence-bounded in LiveKit. /v1/synthesize streams audio bytes; the adapter still calls it once per tokenized sentence.
- Tool calls are supported. Inline tools return to the LiveKit runtime; registered webhook, builtin, and integration tools run server-side through /v1/complete.
- TTS output format. Accepts audio/pcm;rate=NNNN (Cartesia) and audio/wav. Throws on audio/mpeg (ElevenLabs MP3). Pick a routing intent that prefers Cartesia, or pin a PCM-capable provider via constraints.allowedProviders.tts.
- STT input format. Mono PCM16, encoded into a WAV wrapper per utterance. Multi-channel frames throw. Speko handles sample-rate conversion downstream, so whatever the AudioFrame carries is what's uploaded.
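The two format rules above can be illustrated with a short sketch. It is written from the descriptions in this list and is not the adapter's own audio helpers. PcmFrame, parseTtsContentType, and encodeWavMonoPcm16 are hypothetical names, and PcmFrame only approximates the fields an AudioFrame is assumed to carry.

```typescript
// Assumed minimal frame shape for illustration (hypothetical, not LiveKit's type).
interface PcmFrame {
  data: Int16Array;
  sampleRate: number;
  channels: number;
}

type TtsAudioFormat = { kind: 'pcm'; sampleRate: number } | { kind: 'wav' };

// Accepts audio/wav and audio/pcm;rate=NNNN; throws on anything else (e.g. audio/mpeg).
function parseTtsContentType(contentType: string): TtsAudioFormat {
  const [mediaType, ...params] = contentType
    .split(';')
    .map((part) => part.trim().toLowerCase());
  if (mediaType === 'audio/wav') {
    return { kind: 'wav' };
  }
  if (mediaType === 'audio/pcm') {
    for (const param of params) {
      const [key, value] = param.split('=');
      const rate = Number(value);
      if (key === 'rate' && rate > 0 && Math.floor(rate) === rate) {
        return { kind: 'pcm', sampleRate: rate };
      }
    }
    throw new Error(`audio/pcm is missing a valid rate parameter: ${contentType}`);
  }
  throw new Error(`Unsupported TTS content type: ${contentType}`);
}

// Wraps mono PCM16 samples in a canonical 44-byte RIFF/WAVE header.
// The frame's own sample rate is written as-is; no resampling happens here.
function encodeWavMonoPcm16(frame: PcmFrame): Uint8Array {
  if (frame.channels !== 1) {
    throw new Error(`Expected mono audio, got ${frame.channels} channels`);
  }
  const dataSize = frame.data.length * 2; // 2 bytes per 16-bit sample
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };
  writeAscii(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true); // remaining file size after this field
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  view.setUint32(16, 16, true); // fmt chunk length
  view.setUint16(20, 1, true); // format tag 1 = linear PCM
  view.setUint16(22, 1, true); // channel count
  view.setUint32(24, frame.sampleRate, true);
  view.setUint32(28, frame.sampleRate * 2, true); // byte rate = rate * channels * 2
  view.setUint16(32, 2, true); // block align = channels * 2
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, 'data');
  view.setUint32(40, dataSize, true);
  frame.data.forEach((sample, i) => view.setInt16(44 + i * 2, sample, true));
  return new Uint8Array(buffer);
}
```

Under these assumptions, a response labelled audio/pcm;rate=24000 parses to a PCM format at 24000 Hz, while audio/mpeg is rejected up front instead of being passed on as undecodable audio.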
Reference
- createSpekoComponents: convenience factory.
- SpekoSTT: STT class.
- SpekoLLM: LLM class.
- SpekoTTS: TTS class.
- Intent: routing hint type and validator.
- Audio helpers: WAV encode/decode utilities.