@spekoai/client

@spekoai/client is the browser-side companion to @spekoai/sdk. It connects a browser tab to a Speko voice session: it captures the user's microphone, plays the agent's audio, and exchanges structured events such as transcripts and status changes.

Your server must mint a short-lived session token and return only the browser-safe session credentials. Never expose a Speko API key to browser code. For VoiceConversation, audio flows through Speko's browser media transport after the token is minted. For RealtimeVoiceConversation, audio flows between the browser and Speko's S2S WebSocket proxy.

Install

npm install @spekoai/client
# or
pnpm add @spekoai/client

The package does not expose low-level media transport types on its public surface, so most apps import only from @spekoai/client.

Quick start

1. Server mints a session

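// server side: using @spekoai/sdk or raw fetch
const response = await fetch('/v1/sessions', { ... });
// fetch resolves to a Response; parse the body to get the session
const session = await response.json();
// session: { transportToken, transportUrl, roomName, identity, expiresAt }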

See Build a voice agent for the worker side and Real-time browser conversation for the end-to-end browser flow.
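As a rough sketch of the "return only the browser-safe session credentials" rule, the server handler can copy just the session fields listed in the comment above before responding, so that anything else attached on the server never reaches the browser. The `BrowserCredentials` type and the `toBrowserCredentials` helper are hypothetical names for illustration, not part of @spekoai/sdk, and the field types are assumed to be strings.

```typescript
// Hypothetical type: the field names come from the session comment above,
// the string types are an assumption.
interface BrowserCredentials {
  transportToken: string;
  transportUrl: string;
  roomName: string;
  identity: string;
  expiresAt: string;
}

// Hypothetical helper: copies only the allow-listed fields, so extra
// server-side properties (for example an API key) are dropped.
function toBrowserCredentials(
  session: BrowserCredentials & { [key: string]: unknown },
): BrowserCredentials {
  const { transportToken, transportUrl, roomName, identity, expiresAt } = session;
  return { transportToken, transportUrl, roomName, identity, expiresAt };
}
```

An allow-list is used rather than deleting known-sensitive keys, so a field added to the server-side object later is not forwarded by accident.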

2. Browser joins the room

import { VoiceConversation } from '@spekoai/client';

const conversation = await VoiceConversation.create({
  transportToken,   // from server
  transportUrl,     // from server

  onConnect: ({ conversationId }) => console.log('connected', conversationId),
  onDisconnect: ({ reason }) => console.log('disconnected', reason),
  onMessage: ({ source, text, isFinal }) =>
    console.log(source, text, isFinal),
  onStatusChange: (status) => console.log('status', status),
  onModeChange: (mode) => console.log('mode', mode),
  onError: (err) => console.error(err),
});

await conversation.setMicMuted(true);
conversation.setVolume(0.8);
conversation.sendUserMessage('hello');
conversation.sendContextualUpdate('user switched to the checkout page');

await conversation.endSession();
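The `onMessage` callback in the example receives `{ source, text, isFinal }`. One way to turn those events into a display transcript is sketched below. It assumes that a non-final message from a source is superseded by the next message from the same source; that ordering behavior is an assumption, not something this page confirms. `ConversationMessage` and `TranscriptLog` are hypothetical names, and `source` is assumed to be a string.

```typescript
// Hypothetical shape mirroring the onMessage payload shown above.
interface ConversationMessage {
  source: string;
  text: string;
  isFinal: boolean;
}

// Keeps finalized lines in order and at most one interim line per source.
class TranscriptLog {
  private finals: ConversationMessage[] = [];
  private interim = new Map<string, string>();

  push(msg: ConversationMessage): void {
    if (msg.isFinal) {
      // A final message replaces any pending interim text from that source.
      this.interim.delete(msg.source);
      this.finals.push(msg);
    } else {
      this.interim.set(msg.source, msg.text);
    }
  }

  // Finalized lines first, then any still-pending interim lines.
  lines(): string[] {
    const done = this.finals.map((m) => `${m.source}: ${m.text}`);
    const pending = Array.from(this.interim.entries()).map(
      ([source, text]) => `${source}: ${text} (partial)`,
    );
    return done.concat(pending);
  }
}
```

With a helper like this, the callback could be wired as `onMessage: (msg) => log.push(msg)` and the UI re-rendered from `log.lines()`.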

What the SDK owns

Connecting with supplied short-lived session credentials.
Acquiring the microphone with sensible constraints (echo cancellation, noise suppression, auto gain — all togglable via audioConstraints).
Playing remote audio.
Parsing inbound data-channel packets (transcripts, agent messages) and invoking your callbacks.
Sending outbound packets — overrides, user messages, contextual updates.
Mic mute, speaker volume, output device selection.
Tearing everything down on disconnect, including releasing the OS microphone capture.
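The three microphone toggles named in the list above share their names with standard browser `MediaTrackConstraints` fields (`echoCancellation`, `noiseSuppression`, `autoGainControl`). A minimal sketch of building such an object with everything on by default follows. The exact shape that `audioConstraints` accepts is not shown on this page, so `MicProcessing` and `micConstraints` are hypothetical and the VoiceConversation reference should be checked before relying on them.

```typescript
// Field names match the standard MediaTrackConstraints properties.
interface MicProcessing {
  echoCancellation: boolean;
  noiseSuppression: boolean;
  autoGainControl: boolean;
}

// Assumed defaults: all three processing stages enabled.
const DEFAULT_MIC_PROCESSING: MicProcessing = {
  echoCancellation: true,
  noiseSuppression: true,
  autoGainControl: true,
};

// Hypothetical helper: merges per-app overrides over the defaults.
function micConstraints(overrides: Partial<MicProcessing> = {}): MicProcessing {
  return { ...DEFAULT_MIC_PROCESSING, ...overrides };
}
```

For example, `micConstraints({ noiseSuppression: false })` leaves echo cancellation and auto gain on while turning noise suppression off, which may suit users with external noise-cancelling hardware.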

What it doesn't do

Mint sessions from API keys. Keep SPEKO_API_KEY on your server. Browser code should only receive short-lived session tokens.
Retries. A failed connect() throws a SpekoClientError. Retry logic belongs in your app's UX.
Tool calls, guardrail hooks, MCP, VAD score streaming. Deferred — see the package's ROADMAP.md.
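Because the package does not retry, an app that wants retries has to wrap the connect step itself. The sketch below is one generic approach using exponential backoff. `connectWithRetry` and its parameters are hypothetical, and it retries on any thrown error; a real app would probably inspect the SpekoClientError code and retry only failures it considers transient.

```typescript
// Hypothetical helper: retries an async connect function with exponential backoff.
// `sleep` is injectable so the delay can be stubbed out in tests.
async function connectWithRetry<T>(
  connect: () => Promise<T>,
  maxAttempts: number = 3,
  baseDelayMs: number = 500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      // Wait baseDelayMs, then 2x, then 4x, and so on, between attempts.
      if (attempt < maxAttempts) {
        await sleep(baseDelayMs * Math.pow(2, attempt - 1));
      }
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}
```

Since session tokens are short-lived, the `connect` callback may need to fetch fresh credentials from your server on each attempt rather than reuse the first token.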

Reference

VoiceConversation — the primary API surface.
RealtimeVoiceConversation — browser capture/playback for S2S WebSocket sessions.
Callbacks & events — every hook the SDK exposes.
Data channel protocol — wire format for inbound / outbound packets.
Errors — SpekoClientError and its codes.