@spekoai/client
Browser SDK for real-time voice conversations.
@spekoai/client is the browser-side companion to @spekoai/sdk. It connects a browser tab to a Speko voice session: it captures the user's microphone, plays the agent's audio, and exchanges structured events such as transcripts and status changes.
Your server must mint a short-lived session token and return only the browser-safe session credentials. Never expose a Speko API key to browser code. For VoiceConversation, audio flows through Speko's browser media transport after the token is minted. For RealtimeVoiceConversation, audio flows between the browser and Speko's S2S WebSocket proxy.
Install
npm install @spekoai/client
# or
pnpm add @spekoai/client
The package does not expose low-level media transport types on its public surface, so most apps only import from @spekoai/client directly.
Quick start
1. Server mints a session
// server side — using @spekoai/sdk or raw fetch
const session = await fetch('/v1/sessions', { ... });
// returns { transportToken, transportUrl, roomName, identity, expiresAt }
See Build a voice agent for the worker side and Real-time browser conversation for the end-to-end browser flow.
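Before the server responds to the browser, it can help to copy only the fields listed above, so nothing else from the session object leaks into browser code. The following is a minimal sketch of that filtering step, not part of @spekoai/sdk. `toBrowserCredentials` and `isExpired` are hypothetical helpers, and the field types (all strings, with `expiresAt` as an ISO 8601 timestamp) are assumptions.

```typescript
// Fields named in the session response above. The types are assumptions.
interface BrowserSession {
  transportToken: string;
  transportUrl: string;
  roomName: string;
  identity: string;
  expiresAt: string; // assumed to be an ISO 8601 timestamp
}

const BROWSER_SAFE_KEYS = [
  'transportToken',
  'transportUrl',
  'roomName',
  'identity',
  'expiresAt',
] as const;

// Hypothetical helper: copy only the browser-safe fields and drop the rest.
function toBrowserCredentials(session: Record<string, unknown>): BrowserSession {
  const out: Record<string, string> = {};
  for (const key of BROWSER_SAFE_KEYS) {
    const value = session[key];
    if (typeof value !== 'string') {
      throw new Error(`session is missing string field "${key}"`);
    }
    out[key] = value;
  }
  return out as unknown as BrowserSession;
}

// Hypothetical helper: true once the session's expiry time has passed.
function isExpired(session: BrowserSession, now: Date = new Date()): boolean {
  return Date.parse(session.expiresAt) <= now.getTime();
}
```

A route handler would call `toBrowserCredentials` on whatever the session request returned and send the result as JSON. The browser can call `isExpired` before reusing stored credentials.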
2. Browser joins the room
import { VoiceConversation } from '@spekoai/client';
const conversation = await VoiceConversation.create({
  transportToken, // from server
  transportUrl, // from server
  onConnect: ({ conversationId }) => console.log('connected', conversationId),
  onDisconnect: ({ reason }) => console.log('disconnected', reason),
  onMessage: ({ source, text, isFinal }) =>
    console.log(source, text, isFinal),
  onStatusChange: (status) => console.log('status', status),
  onModeChange: (mode) => console.log('mode', mode),
  onError: (err) => console.error(err),
});
await conversation.setMicMuted(true);
conversation.setVolume(0.8);
conversation.sendUserMessage('hello');
conversation.sendContextualUpdate('user switched to the checkout page');
await conversation.endSession();
What the SDK owns
- Connecting with supplied short-lived session credentials.
- Acquiring the microphone with sensible constraints (echo cancellation, noise suppression, auto gain; all togglable via audioConstraints).
- Playing remote audio.
- Parsing inbound data-channel packets (transcripts, agent messages) and invoking your callbacks.
- Sending outbound packets — overrides, user messages, contextual updates.
- Mic mute, speaker volume, output device selection.
- Tearing everything down on disconnect, including releasing the OS microphone capture.
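Because the SDK parses transcripts and hands them to your callbacks, the app only needs to decide how to display them. The following is a minimal sketch of a transcript buffer fed by `onMessage`, using the `{ source, text, isFinal }` payload from the quick start. `Transcript` is a hypothetical helper, not an SDK export. It assumes that each interim message carries the full text of the utterance so far rather than a delta, and it treats `source` as an opaque string.

```typescript
// Payload shape taken from the quick start's onMessage callback.
// Treating `source` as a plain string is an assumption.
interface Message {
  source: string;
  text: string;
  isFinal: boolean;
}

// Hypothetical helper: keeps one line per utterance, replacing an interim
// line in place until a final message arrives for it.
class Transcript {
  private lines: Message[] = [];

  push(msg: Message): void {
    const last = this.lines[this.lines.length - 1];
    if (last && !last.isFinal && last.source === msg.source) {
      // Same speaker is still talking: overwrite the interim text.
      this.lines[this.lines.length - 1] = { ...msg };
    } else {
      // New utterance: start a new line.
      this.lines.push({ ...msg });
    }
  }

  toArray(): readonly Message[] {
    return this.lines;
  }
}
```

It could be wired up as `onMessage: (msg) => transcript.push(msg)`, with the UI re-rendering from `transcript.toArray()`.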
What it doesn't do
- Mint sessions from API keys. Keep SPEKO_API_KEY on your server. Browser code should only receive short-lived session tokens.
- Retries. A failed connect() throws a SpekoClientError. Retry logic belongs in your app's UX.
- Tool calls, guardrail hooks, MCP, VAD score streaming. Deferred; see the package's ROADMAP.md.
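Since retries are left to the app, one possible shape is a small wrapper with exponential backoff. This is only a sketch: `backoffDelays` and `connectWithRetry` are hypothetical names, `connect` stands for any function that opens a session, and retrying on every error is a simplification because the SpekoClientError codes are not listed here.

```typescript
// Exponential backoff schedule: base, 2x base, 4x base, ... capped at maxMs.
function backoffDelays(retries: number, baseMs = 500, maxMs = 8000): number[] {
  return Array.from({ length: retries }, (_, i) => Math.min(baseMs * 2 ** i, maxMs));
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: calls `connect` once, then once more per entry in
// `delays`, sleeping before each retry. Rethrows the last error when the
// delays run out. `sleep` is injectable so tests need not wait.
async function connectWithRetry<T>(
  connect: () => Promise<T>,
  delays: number[] = backoffDelays(3),
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await connect();
    } catch (err) {
      const delay = delays[attempt];
      if (delay === undefined) throw err; // retries exhausted
      await sleep(delay);
    }
  }
}
```

In an app this might wrap the quick-start call, for example `connectWithRetry(() => VoiceConversation.create({ transportToken, transportUrl }))`. Because session tokens are short-lived, a real retry path may also need to request fresh credentials from the server between attempts.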
Reference
- VoiceConversation — the primary API surface.
- RealtimeVoiceConversation — browser capture/playback for S2S WebSocket sessions.
- Callbacks & events — every hook the SDK exposes.
- Data channel protocol — wire format for inbound / outbound packets.
- Errors — SpekoClientError and its codes.