Real-time browser conversation
Wire @spekoai/client into a web app — mint a session, join the transport, stream voice both ways.
@spekoai/client connects a browser tab to a Speko voice session over WebRTC. Your server mints a session (POST /v1/sessions) and the browser joins with the returned transport credentials. Keep SPEKO_API_KEY on your server and return only short-lived session credentials to the browser.
Install
npm install @spekoai/client
1. Mint a session on your server
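// Assumes an existing Express-style app.
app.post('/api/conversations', async (req, res) => {
  const response = await fetch('https://api.speko.dev/v1/sessions', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.SPEKO_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      mode: 'cascade',
      intent: { language: 'en-US' },
      systemPrompt: 'You are a helpful voice assistant.',
      voice: undefined, // let routing pick (JSON.stringify omits undefined fields)
      ttlSeconds: 900, // default
    }),
  });
  // Surface upstream failures instead of forwarding an error body as credentials.
  if (!response.ok) {
    res.status(502).json({ error: 'Failed to create session' });
    return;
  }
  const session = await response.json();
  // Return only the transport credentials, never the API key.
  res.json({
    transportToken: session.transportToken,
    transportUrl: session.transportUrl,
  });
});
Never expose your SPEKO_API_KEY to the browser. The session token is short-lived and scoped to one room.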
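Before forwarding credentials to the browser, it can help to check that the session response actually contains them and to drop everything else. The following is a minimal sketch, assuming the response carries non-empty string fields named transportToken and transportUrl as in the handler above. pickCredentials and TransportCredentials are hypothetical names for illustration and are not part of any Speko package.

```typescript
// Hypothetical helper: not part of @spekoai/client or the Speko API.
// The only fields the browser needs, per the handler above.
interface TransportCredentials {
  transportToken: string;
  transportUrl: string;
}

// Returns only the two transport fields, or throws if either is missing.
// Any other field in the session payload is dropped rather than forwarded.
function pickCredentials(session: unknown): TransportCredentials {
  if (typeof session !== 'object' || session === null) {
    throw new Error('Session response is not an object');
  }
  const { transportToken, transportUrl } = session as Record<string, unknown>;
  if (typeof transportToken !== 'string' || transportToken.length === 0) {
    throw new Error('Session response is missing transportToken');
  }
  if (typeof transportUrl !== 'string' || transportUrl.length === 0) {
    throw new Error('Session response is missing transportUrl');
  }
  return { transportToken, transportUrl };
}
```

In the handler, the final res.json call could pass pickCredentials(session) inside a try/catch, so that a malformed upstream response becomes a server error instead of reaching the browser as undefined credentials.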
2. Join from the browser
import { useEffect, useRef, useState } from 'react';
import { VoiceConversation } from '@spekoai/client';

export function VoicePanel() {
  const convRef = useRef<VoiceConversation | null>(null);
  const [status, setStatus] = useState('idle');
  const [transcript, setTranscript] = useState<string[]>([]);

  async function start() {
    const { transportToken, transportUrl } = await fetch('/api/conversations', {
      method: 'POST',
    }).then((r) => r.json());
    const conv = await VoiceConversation.create({
      transportToken,
      transportUrl,
      onConnect: () => setStatus('connected'),
      onDisconnect: () => setStatus('idle'),
      onMessage: ({ source, text, isFinal }) => {
        if (isFinal) setTranscript((t) => [...t, `${source}: ${text}`]);
      },
      onStatusChange: (s) => setStatus(s),
      onError: (err) => console.error(err),
    });
    convRef.current = conv;
  }

  async function stop() {
    await convRef.current?.endSession();
    convRef.current = null;
  }

  useEffect(() => () => { void convRef.current?.endSession(); }, []);

  return (
    <div>
      <button onClick={start} disabled={status !== 'idle'}>Start</button>
      <button onClick={stop} disabled={status === 'idle'}>Stop</button>
      <p>Status: {status}</p>
      <ul>{transcript.map((t, i) => <li key={i}>{t}</li>)}</ul>
    </div>
  );
}
That's the whole loop: mint → connect → talk → end.
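The component above drops every message that is not final. If you also want to show in-progress text, one option is to keep the transcript update as a pure function outside the component. This is a sketch only: the message shape is taken from the onMessage callback above, and it assumes that each non-final message supersedes the previous non-final one, which this page does not confirm. TranscriptState and applyMessage are hypothetical names.

```typescript
// Message shape as destructured in the onMessage callback above.
interface ConversationMessage {
  source: string;
  text: string;
  isFinal: boolean;
}

// Hypothetical state shape for illustration.
interface TranscriptState {
  lines: string[];        // finalized lines, oldest first
  interim: string | null; // latest non-final line, if any
}

const emptyTranscript: TranscriptState = { lines: [], interim: null };

// Pure update: a final message is appended and clears the interim line;
// a non-final message only replaces the interim line.
// Assumption: a non-final message supersedes the previous non-final one.
function applyMessage(
  state: TranscriptState,
  msg: ConversationMessage,
): TranscriptState {
  const line = `${msg.source}: ${msg.text}`;
  return msg.isFinal
    ? { lines: [...state.lines, line], interim: null }
    : { lines: state.lines, interim: line };
}
```

With state initialized to emptyTranscript, the callback becomes onMessage: (m) => setTranscript((s) => applyMessage(s, m)), and the component renders the finalized lines followed by the interim line when it is not null.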
What you can do mid-conversation
await conv.setMicMuted(true);
conv.setVolume(0.8);
conv.sendUserMessage('hello'); // text input as if spoken
conv.sendContextualUpdate('user navigated to checkout');
sendContextualUpdate injects context the agent will see on its next turn without speaking it aloud. It is useful for app-state changes the agent should know about.
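App state can change more often than the agent needs to hear about it, so you may want to filter updates before sending them. The sketch below wraps any object that has a sendContextualUpdate method and skips empty strings and back-to-back repeats. createContextReporter and ContextualUpdateTarget are hypothetical names, and the void return type is an assumption based on the call above not being awaited.

```typescript
// Minimal view of the conversation object: only the method named above.
// Assumption: the return value can be ignored by callers.
interface ContextualUpdateTarget {
  sendContextualUpdate(text: string): void;
}

// Hypothetical helper: returns a function that forwards an update to the
// target unless it is empty or identical to the update sent just before it.
// The returned boolean reports whether the update was actually sent.
function createContextReporter(
  target: ContextualUpdateTarget,
): (text: string) => boolean {
  let last: string | null = null;
  return (text: string): boolean => {
    const trimmed = text.trim();
    if (trimmed.length === 0 || trimmed === last) return false;
    target.sendContextualUpdate(trimmed);
    last = trimmed;
    return true;
  };
}
```

For example, const report = createContextReporter(conv) can be called from a route-change listener as report('user navigated to checkout'). A re-render that fires the same listener twice then produces a single update.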
Mic / device control
@spekoai/client requests the mic with sensible defaults (echo cancellation, noise suppression, automatic gain control). Override them per session:
await VoiceConversation.create({
  transportToken,
  transportUrl,
  audioConstraints: {
    echoCancellation: false,
    noiseSuppression: false,
    autoGainControl: false,
  },
});
What the SDK does not hide
- Long-lived API keys. Keep SPEKO_API_KEY on your server. Browser code should only receive short-lived session credentials.
- Reconnect / retry. A failed connect() throws SpekoClientError. Your UX decides whether to retry.
- Tool calls / MCP / VAD streaming. Deferred.
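Since retry is left to the application, one possible policy is a bounded retry with exponential backoff around whatever call establishes the connection. The helper below is a generic sketch and uses no Speko APIs. withRetry, RetryOptions, and the option names are hypothetical, and it retries on any thrown error, which may be too broad for failures such as a denied microphone permission.

```typescript
// Hypothetical options for illustration.
interface RetryOptions {
  attempts: number;    // total tries, including the first
  baseDelayMs: number; // wait before the second try; doubles after each failure
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs task until it resolves or the attempt budget is spent, then
// rethrows the last error so the caller can show it to the user.
async function withRetry<T>(
  task: () => Promise<T>,
  { attempts, baseDelayMs }: RetryOptions,
): Promise<T> {
  if (attempts < 1) throw new RangeError('attempts must be at least 1');
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      // No delay after the final failed attempt.
      if (attempt < attempts - 1) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

A call might look like withRetry(() => VoiceConversation.create({ transportToken, transportUrl }), { attempts: 3, baseDelayMs: 500 }). Because session tokens are short-lived, it may be safer to mint a fresh session inside the task on each attempt. That is an assumption about token reuse, not documented behavior.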