Real-time browser conversation

Wire @spekoai/client into a web app — mint a session, join the transport, stream voice both ways.

@spekoai/client connects a browser tab to a Speko voice session over WebRTC. Your server mints a session (POST /v1/sessions) and the browser joins with the returned transport credentials. Keep SPEKO_API_KEY on your server and return only short-lived session credentials to the browser.

Install

npm install @spekoai/client

1. Mint a session on your server

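app.post('/api/conversations', async (req, res) => {
  try {
    const response = await fetch('https://api.speko.dev/v1/sessions', {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${process.env.SPEKO_API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        mode: 'cascade',
        intent: { language: 'en-US' },
        systemPrompt: 'You are a helpful voice assistant.',
        voice: undefined,                  // dropped by JSON.stringify; let routing pick
        ttlSeconds: 900,                   // default
      }),
    });

    // Fail loudly instead of sending empty credentials to the browser.
    if (!response.ok) {
      throw new Error(`Session request failed with status ${response.status}`);
    }
    const session = await response.json();

    // Forward only the short-lived transport credentials.
    res.json({
      transportToken: session.transportToken,
      transportUrl: session.transportUrl,
    });
  } catch (err) {
    console.error(err);
    res.status(502).json({ error: 'Could not create a voice session' });
  }
});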

Never expose your SPEKO_API_KEY to the browser. The session token is short-lived and scoped to one room.
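
Since the browser should only ever receive the transport credentials, it may help to validate the session response before forwarding it. The following is a minimal sketch, not part of the SDK: `pickBrowserCredentials` and `BrowserCredentials` are hypothetical names, and the only assumption about the response is that it carries the `transportToken` and `transportUrl` fields used above.

```typescript
// Hypothetical helper: not provided by @spekoai/client or the Speko API.
interface BrowserCredentials {
  transportToken: string;
  transportUrl: string;
}

function pickBrowserCredentials(session: unknown): BrowserCredentials {
  if (typeof session !== 'object' || session === null) {
    throw new Error('Session response is not an object');
  }
  const { transportToken, transportUrl } = session as Record<string, unknown>;
  if (typeof transportToken !== 'string' || transportToken.length === 0) {
    throw new Error('Session response is missing transportToken');
  }
  if (typeof transportUrl !== 'string' || transportUrl.length === 0) {
    throw new Error('Session response is missing transportUrl');
  }
  // Build a fresh object so no other session fields reach the browser.
  return { transportToken, transportUrl };
}
```

In the route handler this would replace the hand-built object, for example `res.json(pickBrowserCredentials(session))`.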

2. Join from the browser

import { useEffect, useRef, useState } from 'react';
import { VoiceConversation } from '@spekoai/client';

export function VoicePanel() {
  const convRef = useRef<VoiceConversation | null>(null);
  const [status, setStatus] = useState('idle');
  const [transcript, setTranscript] = useState<string[]>([]);

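  async function start() {
    // UI-local status: keeps Start disabled while the session is minted and joined.
    setStatus('connecting');
    try {
      const res = await fetch('/api/conversations', { method: 'POST' });
      if (!res.ok) throw new Error(`Session request failed with status ${res.status}`);
      const { transportToken, transportUrl } = await res.json();

      const conv = await VoiceConversation.create({
        transportToken,
        transportUrl,
        onConnect: () => setStatus('connected'),
        onDisconnect: () => setStatus('idle'),
        onMessage: ({ source, text, isFinal }) => {
          // Only finalized messages are appended to the transcript.
          if (isFinal) setTranscript((t) => [...t, `${source}: ${text}`]);
        },
        onStatusChange: (s) => setStatus(s),
        onError: (err) => console.error(err),
      });

      convRef.current = conv;
    } catch (err) {
      console.error(err);
      setStatus('idle'); // re-enable Start so the user can try again
    }
  }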

  async function stop() {
    await convRef.current?.endSession();
    convRef.current = null;
  }

  useEffect(() => () => { void convRef.current?.endSession(); }, []);

  return (
    <div>
      <button onClick={start} disabled={status !== 'idle'}>Start</button>
      <button onClick={stop} disabled={status === 'idle'}>Stop</button>
      <p>Status: {status}</p>
      <ul>{transcript.map((t, i) => <li key={i}>{t}</li>)}</ul>
    </div>
  );
}

That's the whole loop: mint → connect → talk → end.
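
The component above drops every message that is not final. If you want to show a live preview as well, one option is a small reducer that keeps finalized lines and the latest interim line separately. This is a sketch under assumptions: the message shape is taken from the `onMessage` callback above, while `TranscriptState`, `emptyTranscript`, and `applyMessage` are hypothetical names, and it assumes a later message supersedes any earlier non-final one.

```typescript
// Shape taken from the onMessage callback; `source` is treated as an opaque string.
interface ConversationMessage {
  source: string;
  text: string;
  isFinal: boolean;
}

// Hypothetical UI state: not an SDK type.
interface TranscriptState {
  lines: string[];        // finalized lines, oldest first
  interim: string | null; // latest non-final line, if any
}

const emptyTranscript: TranscriptState = { lines: [], interim: null };

function applyMessage(state: TranscriptState, msg: ConversationMessage): TranscriptState {
  const line = `${msg.source}: ${msg.text}`;
  if (msg.isFinal) {
    // Assumption: a final message replaces whatever preview was showing.
    return { lines: [...state.lines, line], interim: null };
  }
  // Non-final: update the preview only, leave finalized lines untouched.
  return { lines: state.lines, interim: line };
}
```

With React this could back a single state value, for example `onMessage: (m) => setTranscript((s) => applyMessage(s, m))`, with the interim line rendered in a muted style below the list.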

What you can do mid-conversation

await conv.setMicMuted(true);
conv.setVolume(0.8);
conv.sendUserMessage('hello');                 // text input as if spoken
conv.sendContextualUpdate('user navigated to checkout');

sendContextualUpdate injects context the agent will see on its next turn without speaking it aloud — useful for app-state changes the agent should know about.
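
App state can change more often than the agent needs to hear about it, for instance when a route re-renders. The sketch below forwards an update only when the text differs from the last one sent. `createContextNotifier` and `ContextSink` are hypothetical names; the only thing assumed about the conversation object is the `sendContextualUpdate(text)` call shown above.

```typescript
// Minimal structural type: just the one method this sketch relies on.
interface ContextSink {
  sendContextualUpdate(text: string): void;
}

// Hypothetical helper: returns a function that skips consecutive duplicates.
function createContextNotifier(sink: ContextSink): (text: string) => boolean {
  let last: string | null = null;
  return (text: string): boolean => {
    if (text === last) return false; // same as the previous update, skip it
    last = text;
    sink.sendContextualUpdate(text);
    return true; // an update was actually sent
  };
}
```

A usage sketch: `const notify = createContextNotifier(conv);` once after connecting, then `notify('user navigated to checkout')` from your navigation handler.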

Mic / device control

@spekoai/client requests the microphone with sensible defaults: echo cancellation, noise suppression, and automatic gain control (AGC). Override them per session:

const conv = await VoiceConversation.create({
  transportToken,
  transportUrl,
  audioConstraints: {
    echoCancellation: false,
    noiseSuppression: false,
    autoGainControl: false,
  },
});
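
If you override only some of these flags, it can be safer to pass all three explicitly than to rely on how partial overrides are merged, which this guide does not specify. The helper below is a sketch: `resolveAudioConstraints` is a hypothetical name, and the default values are an assumption based on the description above (all three processing stages on).

```typescript
interface AudioProcessingConstraints {
  echoCancellation: boolean;
  noiseSuppression: boolean;
  autoGainControl: boolean;
}

// Assumption: mirrors the defaults described in this guide (all processing enabled).
const ASSUMED_DEFAULTS: AudioProcessingConstraints = {
  echoCancellation: true,
  noiseSuppression: true,
  autoGainControl: true,
};

// Hypothetical helper: fills in any flag the caller did not set.
function resolveAudioConstraints(
  overrides: Partial<AudioProcessingConstraints> = {},
): AudioProcessingConstraints {
  return {
    echoCancellation: overrides.echoCancellation ?? ASSUMED_DEFAULTS.echoCancellation,
    noiseSuppression: overrides.noiseSuppression ?? ASSUMED_DEFAULTS.noiseSuppression,
    autoGainControl: overrides.autoGainControl ?? ASSUMED_DEFAULTS.autoGainControl,
  };
}
```

For example, a music-practice app might pass `audioConstraints: resolveAudioConstraints({ noiseSuppression: false })` so sustained notes are not filtered out while echo cancellation stays on.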

What the SDK does not hide

Long-lived API keys. Keep SPEKO_API_KEY on your server. Browser code should only receive short-lived session credentials.
Reconnect / retry. A failed connect() throws SpekoClientError. Your UX decides whether to retry.
Tool calls / MCP / VAD streaming. Not available yet; support is deferred.
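
Because retry policy is left to your app, here is one possible shape for it: a generic retry wrapper with capped exponential backoff. This is a sketch, not SDK functionality. `retryWithBackoff`, `backoffDelayMs`, and `RetryOptions` are hypothetical names, and the wrapper retries on any thrown error rather than checking for SpekoClientError specifically.

```typescript
// Hypothetical options: not an SDK type.
interface RetryOptions {
  attempts: number;    // total tries, including the first one
  baseDelayMs: number; // wait before the second try
  maxDelayMs: number;  // upper bound for any single wait
}

// Delay before retry number `retryIndex` (0 = first retry): base * 2^retryIndex, capped.
function backoffDelayMs(retryIndex: number, opts: RetryOptions): number {
  return Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(2, retryIndex));
}

async function retryWithBackoff<T>(fn: () => Promise<T>, opts: RetryOptions): Promise<T> {
  if (opts.attempts < 1) throw new RangeError('attempts must be at least 1');
  let lastError: unknown;
  for (let i = 0; i < opts.attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait only if another attempt will follow.
      if (i < opts.attempts - 1) {
        const delay = backoffDelayMs(i, opts);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError; // every attempt failed: surface the last error
}
```

When wrapping the join step, it is probably safest for `fn` to mint a fresh session on each attempt, since this guide does not say whether a transport token can be reused after a failed connect. Keep `attempts` small so a persistent failure reaches the user quickly.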

Client API

Full @spekoai/client reference.

Build the agent worker

Worker side of the same architecture.