Architecture
- Browser uses @spekoai/client to join with a session token your server mints.
- Your API server mints the token (POST /v1/sessions or your own livekit-server-sdk flow) and dispatches the agent worker.
- Your agent worker (this guide) runs @livekit/agents with Speko-backed STT/LLM/TTS.
Install
@livekit/agents and @livekit/rtc-node are peer dependencies; pin the versions you actually run.
Worker entry
agent.ts
Run with node agent.js (after build) or your tsx/bun setup of choice. The worker registers with LiveKit Cloud under agentName and waits for dispatches.
Per-session config from dispatch metadata
When your server creates a session, the dispatcher passes JSON metadata to the worker. Read it in entry to build a pipeline per session:
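As a minimal sketch of the parsing step, the helper below turns a raw metadata string into a typed config with safe defaults. The SessionConfig fields (language, voice, allowedTtsProviders) and the parseSessionMetadata name are hypothetical, chosen for illustration; they are not the adapter's actual schema, so substitute whatever your server puts in the dispatch metadata.

```typescript
// Hypothetical per-session settings. The field names are illustrative
// assumptions, not the adapter's real metadata schema.
interface SessionConfig {
  language: string;
  voice: string;
  allowedTtsProviders: string[];
}

// Return a fresh defaults object each time so callers never share state.
function defaultConfig(): SessionConfig {
  return { language: 'en', voice: 'default', allowedTtsProviders: [] };
}

// Parse the raw metadata string from the dispatch. Anything missing,
// malformed, or of the wrong type falls back to the default value, so a
// bad payload never crashes the worker.
function parseSessionMetadata(raw: string | undefined): SessionConfig {
  const config = defaultConfig();
  if (!raw) return config;

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return config; // not valid JSON: keep defaults
  }
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    return config; // valid JSON, but not an object
  }

  const data = parsed as Record<string, unknown>;
  const language = data['language'];
  const voice = data['voice'];
  const providers = data['allowedTtsProviders'];

  if (typeof language === 'string') config.language = language;
  if (typeof voice === 'string') config.voice = voice;
  if (Array.isArray(providers)) {
    // Keep only string entries; silently drop anything else.
    config.allowedTtsProviders = providers.filter(
      (p): p is string => typeof p === 'string',
    );
  }
  return config;
}
```

Inside entry you would call this with the metadata string the worker receives for the job (in @livekit/agents this is expected to be available on the job context, e.g. ctx.job.metadata; verify against the version you pinned) and use the result to choose the pipeline options for that session.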
Limitations of v1
- Non-streaming end-to-end. STT waits for end-of-utterance, LLM returns one chunk, TTS synthesizes a sentence at a time.
- No tool / function calls. /v1/complete doesn't expose tool invocation yet.
- TTS format constraints. Cartesia (PCM) and WAV TTS work. ElevenLabs MP3 currently throws; pin a PCM-capable provider via constraints.allowedProviders.tts or rely on the router's score-driven default.
- STT input. Mono PCM16 frames; multi-channel throws.
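For the STT input limit, one possible workaround when your audio source produces stereo is to downmix to mono before the samples reach the adapter. The sketch below averages interleaved PCM16 samples across channels. The downmixToMono helper is hypothetical and not part of the adapter, and where you would hook it into your audio path depends on your setup.

```typescript
// Hypothetical helper: average interleaved PCM16 samples across channels
// to produce a mono buffer. Not part of @spekoai/adapter-livekit.
function downmixToMono(interleaved: Int16Array, channels: number): Int16Array {
  if (!Number.isInteger(channels) || channels < 1) {
    throw new RangeError(`channels must be a positive integer, got ${channels}`);
  }
  if (interleaved.length % channels !== 0) {
    throw new RangeError('sample count is not a multiple of the channel count');
  }
  // Already mono: return the input unchanged (no copy).
  if (channels === 1) return interleaved;

  const frames = interleaved.length / channels;
  const mono = new Int16Array(frames);
  for (let i = 0; i < frames; i++) {
    let sum = 0;
    for (let c = 0; c < channels; c++) {
      sum += interleaved[i * channels + c];
    }
    // The mean of int16 values always fits in int16, so no clamping is needed.
    mono[i] = Math.round(sum / channels);
  }
  return mono;
}
```

Averaging keeps the result inside the 16-bit range without clipping, at the cost of lowering the level of content that is present in only one channel.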
See the @spekoai/adapter-livekit reference for the full surface.
Next
Browser side
Wire @spekoai/client into your dashboard / web app.
Adapter API
Full adapter reference.