# Speko (/)
One API for every voice provider. Speko benchmarks STT, LLM, and TTS in real time and routes each call to the best provider for your language and latency/cost target.
Speko is a voice gateway. You ship one integration; we route each request to the highest-scoring provider for your `(language, region, optimizeFor)` intent. Failover is server-side. Providers rotate without a code change.
## Two ways to integrate [#two-ways-to-integrate]
* Call `/v1/transcribe`, `/v1/synthesize`, `/v1/complete` directly. Best for batch jobs and server pipelines.
* Real-time voice in the browser. Mic in, agent voice out, transcripts on the data channel.
* Inbound and outbound PSTN calls with lifecycle webhooks, reports, recordings, and transfers.
## Start here [#start-here]
* Sign up, mint an API key, make your first transcribe call.
* Intent, scoring, failover. The model behind every Speko call.
* Use your own provider credentials. Speko routes; you pay providers directly.
* Every `/v1/*` endpoint, request/response shape, headers.
## SDKs [#sdks]
* TypeScript HTTP client.
* Browser SDK for real-time voice.
* Async + sync Python client.
* Drop-in STT/LLM/TTS for your existing voice agent framework.
# Audio helpers (/adapter-livekit/audio)
WAV encode / decode and MIME parsing utilities.
The adapter exports the three audio helpers it uses internally. They're stable exports — safe to reuse if you're building custom pipelines or writing tests.
```ts
import {
  framesToWav,
  parseWav,
  pcmSampleRateFromContentType,
} from '@spekoai/adapter-livekit';
```
## `framesToWav` [#framestowav]
```ts
function framesToWav(buffer: AudioBuffer): Uint8Array;
```
Encodes a single LiveKit `AudioFrame`, or an array of them, into a PCM16 mono WAV byte stream. `SpekoSTT` uses it to wrap each utterance before uploading to `/v1/transcribe`.
* Combines frames via `combineAudioFrames` from `@livekit/rtc-node`.
* Writes a standard 44-byte RIFF/WAVE header: `fmt ` chunk (PCM, 16-bit, mono, `sampleRate` from frames) + `data` chunk.
* Sample rate is pulled from the input frames — whatever LiveKit gives you is what's encoded.
**Mono-only.** A multi-channel `AudioBuffer` throws:
```
SpekoSTT: expected mono audio (1 channel), got 2. Configure your LiveKit AgentSession to pass mono audio or pre-mix upstream of the STT.
```
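To make the header layout concrete, here is a minimal sketch of a 44-byte RIFF/WAVE writer for mono PCM16 samples. It is not the adapter's `framesToWav`: the function name `encodePcm16MonoWav` and its `Int16Array` input are assumptions chosen so the example runs without LiveKit installed.

```typescript
// Hypothetical helper for illustration only; it takes raw samples instead of AudioFrames.
function encodePcm16MonoWav(samples: Int16Array, sampleRate: number): Uint8Array {
  const dataBytes = samples.length * 2;
  const out = new Uint8Array(44 + dataBytes);
  const view = new DataView(out.buffer);
  const ascii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  ascii(0, 'RIFF');
  view.setUint32(4, 36 + dataBytes, true); // RIFF chunk size = total size - 8
  ascii(8, 'WAVE');
  ascii(12, 'fmt ');
  view.setUint32(16, 16, true); // fmt chunk body size for PCM
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, 1, true); // channels = 1 (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true); // byte rate = sampleRate * block align
  view.setUint16(32, 2, true); // block align = channels * bytes per sample
  view.setUint16(34, 16, true); // bits per sample
  ascii(36, 'data');
  view.setUint32(40, dataBytes, true);

  // Samples follow the header as little-endian signed 16-bit integers.
  for (let i = 0; i < samples.length; i++) view.setInt16(44 + i * 2, samples[i], true);
  return out;
}
```

All multi-byte fields are little-endian, which is why every `DataView` write passes `true`.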
## `parseWav` [#parsewav]
```ts
function parseWav(bytes: Uint8Array): {
  pcm: Uint8Array;
  sampleRate: number;
  channels: number;
};
```
Minimal PCM16 WAV parser. Used by `SpekoTTS` to unwrap WAV-encoded proxy responses into raw samples for `AudioByteStream`.
Accepted subset:
* Valid `RIFF` / `WAVE` header.
* `fmt ` chunk present and of `format = 1` (PCM).
* 16-bit samples.
* `data` chunk reachable by walking subsequent chunks (tolerates e.g. `LIST` chunks between `fmt ` and `data`).
Anything outside this subset throws a descriptive error. `channels` is returned as-is — the caller is responsible for deciding whether stereo is acceptable. `SpekoTTS` currently throws on stereo.
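The accepted subset can be sketched as a small chunk walker. This is a hypothetical re-implementation written from the rules above, not the adapter's source; the name `parsePcm16Wav` and the exact error strings are assumptions.

```typescript
interface ParsedWav {
  pcm: Uint8Array;
  sampleRate: number;
  channels: number;
}

// Hypothetical parser for illustration; the adapter's parseWav may differ in detail.
function parsePcm16Wav(bytes: Uint8Array): ParsedWav {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const tag = (o: number): string =>
    String.fromCharCode(bytes[o], bytes[o + 1], bytes[o + 2], bytes[o + 3]);

  if (bytes.length < 12 || tag(0) !== 'RIFF' || tag(8) !== 'WAVE') {
    throw new Error('not a RIFF/WAVE file');
  }

  let format: { sampleRate: number; channels: number } | undefined;
  let offset = 12;
  while (offset + 8 <= bytes.length) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true);
    const body = offset + 8;
    if (id === 'fmt ') {
      if (view.getUint16(body, true) !== 1) throw new Error('only PCM (format = 1) is supported');
      if (view.getUint16(body + 14, true) !== 16) throw new Error('only 16-bit samples are supported');
      format = {
        channels: view.getUint16(body + 2, true),
        sampleRate: view.getUint32(body + 4, true),
      };
    } else if (id === 'data') {
      if (!format) throw new Error('data chunk appeared before fmt chunk');
      const end = Math.min(body + size, bytes.length);
      return { pcm: bytes.subarray(body, end), ...format };
    }
    // Any other chunk (for example LIST) is skipped; chunk bodies are padded to even length.
    offset = body + size + (size % 2);
  }
  throw new Error('no data chunk found');
}
```

Like the documented helper, the sketch returns `channels` untouched and leaves the stereo decision to the caller.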
## `pcmSampleRateFromContentType` [#pcmsampleratefromcontenttype]
```ts
function pcmSampleRateFromContentType(
  contentType: string,
  fallback: number,
): number;
```
Parse the `rate` parameter out of a Cartesia-style content type:
```ts
pcmSampleRateFromContentType('audio/pcm;rate=24000', 16_000); // 24000
pcmSampleRateFromContentType('audio/pcm', 16_000); // 16000
pcmSampleRateFromContentType('audio/pcm;rate=abc', 16_000); // 16000
```
Returns `fallback` when the rate is missing, zero, or unparseable. The `rate=` match is case-insensitive.
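For readers who want to see the fallback rules spelled out, the sketch below reproduces the documented behavior with a single regular expression. The function name `sampleRateFromContentType` is hypothetical, and the regex is an assumption about how such parsing could be done, not the adapter's code.

```typescript
// Hypothetical stand-in for pcmSampleRateFromContentType, written from the documented behavior.
function sampleRateFromContentType(contentType: string, fallback: number): number {
  // The "i" flag makes the parameter name case-insensitive; only digits count as a rate.
  const match = /;\s*rate\s*=\s*(\d+)/i.exec(contentType);
  if (!match) return fallback; // missing or non-numeric rate
  const rate = Number(match[1]);
  return rate > 0 ? rate : fallback; // zero is treated as "no rate"
}
```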
## Intended usage [#intended-usage]
You shouldn't need these helpers when consuming the adapter through [`createSpekoComponents`](/adapter-livekit/create-speko-components) — they're used internally by `SpekoSTT` and `SpekoTTS`. They're exported for:
* **Unit tests** — build canned WAV fixtures with `framesToWav`, round-trip them through `parseWav`.
* **Custom STT / TTS pipelines** that need to reuse the same WAV framing Speko uses.
* **Debugging** — decode what an upstream provider returned without instantiating a full TTS.
# createSpekoComponents (/adapter-livekit/create-speko-components)
Build a { stt, llm, tts } bundle ready for voice.AgentSession.
`createSpekoComponents` is the one-call wiring helper for `voice.AgentSession`. It constructs `SpekoSTT`, `SpekoLLM`, `SpekoTTS` from a single options object and wraps STT and TTS with LiveKit's `StreamAdapter` so Speko's streaming REST proxy can drive a streaming session.
```ts
import { voice } from '@livekit/agents';
import { createSpekoComponents } from '@spekoai/adapter-livekit';

// `speko` is an initialised @spekoai/sdk client; `vad` is a loaded VAD instance.
const { stt, llm, tts } = createSpekoComponents({
  speko,
  vad,
  intent: { language: 'en-US', optimizeFor: 'balanced' },
});
const session = new voice.AgentSession({ vad, stt, llm, tts });
```
## Signature [#signature]
```ts
function createSpekoComponents(
  options: CreateSpekoComponentsOptions,
): SpekoComponents;
```
## `CreateSpekoComponentsOptions` [#createspekocomponentsoptions]
| Field | Type | Required | Description |
| ------------------------ | ----------------------------------- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `speko` | `Speko` | ✅ | Initialised `@spekoai/sdk` client. |
| `intent` | [`Intent`](/adapter-livekit/intent) | ✅ | Routing hint shared by STT, LLM, and TTS. |
| `vad` | `VAD` | ✅ | VAD instance used by the `stt.StreamAdapter`. Typically `await silero.VAD.load()`. |
| `voice` | `string?` | | Voice id passed to `SpekoTTS` (maps to the Speko proxy's `voice` param). |
| `constraints` | `PipelineConstraints?` | | Allow-list constraints applied to all three modalities. |
| `sentenceTokenizer` | `tokenize.SentenceTokenizer?` | | Tokenizer for chunking LLM output before TTS. Defaults to `tokenize.basic.SentenceTokenizer`. |
| `llm` | `{ temperature?, maxTokens? }?` | | Tuning forwarded to `/v1/complete`. |
| `ttsOptions` | `{ sampleRate?, speed? }?` | | Output sample rate and speech speed forwarded to `SpekoTTS`. |
| `agentId` | `string?` | | Enables the [registered-tools loader](/guides/tool-calling). When set, the adapter calls `speko.agents.tools.listChatTools(agentId)` once per session — using the `speko` client you pass for auth and base URL — and merges the result with LiveKit's runtime `ToolContext`. Registered tools win on name collision. Omit to keep runtime-only behavior. |
| `apiBaseUrl` | `string?` | | **Deprecated and ignored** — the loader reads the base URL from the `speko` client. Safe to omit. |
| `apiKey` | `string?` | | **Deprecated and ignored** — the loader reads the API key from the `speko` client. Safe to omit. |
| `onRegisteredToolsError` | `(err: Error) => void?` | | Called once if the registered-tools fetch fails. Voice session keeps running with runtime-only tools — this is a soft degradation, not a crash. |
## Registered tools [#registered-tools]
When `agentId` is set, `createSpekoComponents` constructs a `RegisteredToolsLoader` for the underlying `SpekoLLM`. The loader lazily calls `speko.agents.tools.listChatTools(agentId)` on the first `chat()` of each session — reusing the `Speko` client you pass for auth and base URL — and caches the result for the LLM's lifetime. Voice sessions live for seconds-to-minutes and `chat()` is called many times — re-fetching every turn would be wasteful. (`apiBaseUrl`/`apiKey` are deprecated and ignored; the `speko` client carries both.)
On collision with a runtime tool of the same name, the registered tool wins (it's the customer's authoritative declaration). Fetch failures are non-fatal — the loader returns `undefined` and the agent continues with runtime tools only, calling `onRegisteredToolsError` once.
`listChatTools` returns every source kind — `inline`, `webhook`, `builtin`, and `integration` — already in the `ChatTool[]` shape `/v1/complete` accepts.
See the [tool calling guide](/guides/tool-calling) for the full picture.
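The merge and caching behavior described in this section can be sketched as follows. Everything here is illustrative: `ToolLike`, `mergeTools`, and `createLazyLoader` are hypothetical names, and caching a failed fetch as `undefined` is an assumption, not a statement about `RegisteredToolsLoader` internals.

```typescript
interface ToolLike {
  name: string;
  description?: string;
}

// Registered tools are written last, so they replace runtime tools with the same name.
function mergeTools(runtime: ToolLike[], registered: ToolLike[] | undefined): ToolLike[] {
  const byName = new Map<string, ToolLike>();
  for (const tool of runtime) byName.set(tool.name, tool);
  for (const tool of registered ?? []) byName.set(tool.name, tool);
  return Array.from(byName.values());
}

// Fetches on first use, reuses the same promise afterwards, and reports a failure at most once.
function createLazyLoader(
  fetchTools: () => Promise<ToolLike[]>,
  onError?: (err: Error) => void,
): () => Promise<ToolLike[] | undefined> {
  let cached: Promise<ToolLike[] | undefined> | undefined;
  return () => {
    if (!cached) {
      cached = fetchTools().catch((err: unknown) => {
        onError?.(err instanceof Error ? err : new Error(String(err)));
        return undefined; // soft degradation: caller continues with runtime tools only
      });
    }
    return cached;
  };
}
```

A caller would await the loader on each turn and pass the result to `mergeTools`; only the first call triggers a network request.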
## Returns — `SpekoComponents` [#returns--spekocomponents]
```ts
interface SpekoComponents {
  stt: stt.StreamAdapter; // wraps SpekoSTT + vad
  llm: SpekoLLM; // used directly
  tts: tts.StreamAdapter; // wraps SpekoTTS + sentenceTokenizer
}
```
Drop the returned object straight into a `voice.AgentSession`.
## Custom sentence tokenizer [#custom-sentence-tokenizer]
```ts
import { tokenize } from '@livekit/agents';

const { stt, llm, tts } = createSpekoComponents({
  speko,
  vad,
  intent,
  sentenceTokenizer: new tokenize.basic.SentenceTokenizer({ minSentenceLength: 20 }),
});
```
Use a longer minimum sentence length if you want fewer, longer TTS calls at the cost of latency before the first audio chunk.
## Constraints shared across modalities [#constraints-shared-across-modalities]
```ts
createSpekoComponents({
  speko,
  vad,
  intent: { language: 'en' },
  constraints: {
    allowedProviders: {
      stt: ['deepgram'],
      llm: ['anthropic'],
      tts: ['cartesia'],
    },
  },
});
```
Every underlying call (`/v1/transcribe`, `/v1/complete`, `/v1/synthesize`) receives the same constraints object.
## Opting out — use classes directly [#opting-out--use-classes-directly]
If you need finer control, construct the classes yourself. `createSpekoComponents` is a convenience wrapper; nothing stops you from building the pipeline manually.
```ts
import { SpekoSTT, SpekoLLM, SpekoTTS } from '@spekoai/adapter-livekit';
import { stt, tts, tokenize } from '@livekit/agents';
const spekoSTT = new SpekoSTT({ speko, intent });
const wrappedSTT = new stt.StreamAdapter(spekoSTT, vad);
const spekoLLM = new SpekoLLM({ speko, intent, temperature: 0.7 });
const spekoTTS = new SpekoTTS({ speko, intent, voice: 'sonic-english' });
const wrappedTTS = new tts.StreamAdapter(spekoTTS, new tokenize.basic.SentenceTokenizer());
```
# Intent (/adapter-livekit/intent)
Routing hint type and construction-time validator.
`Intent` is the routing hint every adapter class takes. It's a re-export of `RoutingIntent` from `@spekoai/sdk`, so anything you already have typed as a `RoutingIntent` passes through without conversion.
```ts
import type { Intent, OptimizeFor } from '@spekoai/adapter-livekit';
```
## Type [#type]
```ts
type Intent = {
  language: string; // BCP-47
  region?: string; // e.g. "global", "us-east4", "europe-west3"
  optimizeFor?: 'balanced' | 'accuracy' | 'latency' | 'cost';
};
```
## `validateIntent(intent)` [#validateintentintent]
Throws a descriptive `Error` when the intent is malformed. Called by every adapter class constructor, so a bad intent fails at construction time rather than deep inside the first STT / LLM / TTS call.
```ts
import { validateIntent } from '@spekoai/adapter-livekit';
validateIntent({ language: 'en-US' });
// ok
validateIntent({ language: '' });
// throws: SpekoAdapter: intent.language is required (BCP-47 tag)
validateIntent({ language: 'en', optimizeFor: 'speed' as any });
// throws: SpekoAdapter: unknown optimizeFor "speed". Expected one of: balanced, accuracy, latency, cost.
```
Validation rules:
* `language` must be a non-empty string.
* `region`, if set, is forwarded to Speko for region-aware latency ranking.
* `optimizeFor`, if set, must be one of `balanced`, `accuracy`, `latency`, `cost`.
There is no BCP-47 syntactic validation beyond "is a non-empty string": the router accepts short codes (`en`) and region-tagged codes (`es-MX`) and normalises downstream.
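The rules above are small enough to express directly. The following is a sketch of a validator that applies them, assuming the error messages shown in the earlier examples; `checkIntent` and `IntentLike` are hypothetical names, and the real `validateIntent` may be implemented differently.

```typescript
type OptimizeFor = 'balanced' | 'accuracy' | 'latency' | 'cost';

interface IntentLike {
  language: string;
  region?: string;
  optimizeFor?: OptimizeFor;
}

const OPTIMIZE_FOR: readonly string[] = ['balanced', 'accuracy', 'latency', 'cost'];

// Hypothetical re-implementation for illustration only.
function checkIntent(intent: IntentLike): void {
  if (typeof intent.language !== 'string' || intent.language.length === 0) {
    throw new Error('SpekoAdapter: intent.language is required (BCP-47 tag)');
  }
  if (intent.optimizeFor !== undefined && OPTIMIZE_FOR.indexOf(intent.optimizeFor) === -1) {
    throw new Error(
      `SpekoAdapter: unknown optimizeFor "${String(intent.optimizeFor)}". ` +
        `Expected one of: ${OPTIMIZE_FOR.join(', ')}.`,
    );
  }
  // region is deliberately not checked here: it is forwarded as-is.
}
```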
## Sharing one intent [#sharing-one-intent]
The adapter pattern is "one intent per agent session, shared across modalities":
```ts
const intent: Intent = { language: 'en-US', region: 'global', optimizeFor: 'latency' };
const { stt, llm, tts } = createSpekoComponents({ speko, vad, intent });
```
If you need per-modality divergence (e.g. latency-optimised STT with cost-optimised TTS), construct the classes directly:
```ts
const sttAdapter = new SpekoSTT({ speko, intent: { ...intent, optimizeFor: 'latency' } });
const ttsAdapter = new SpekoTTS({ speko, intent: { ...intent, optimizeFor: 'cost' } });
```
# @spekoai/adapter-livekit (/adapter-livekit/overview)
LiveKit Agents adapter — route STT, LLM, and TTS through Speko.
`@spekoai/adapter-livekit` bridges a [LiveKit Agents](https://docs.livekit.io/agents/) worker to the Speko proxy. Drop it into a standard agent entry file and the router picks the best STT, LLM, and TTS provider per call. Failover is server-side; you don't ship provider API keys.
## Install [#install]
```sh
npm install @spekoai/sdk @spekoai/adapter-livekit \
@livekit/agents @livekit/agents-plugin-silero @livekit/rtc-node
```
`@livekit/agents` and `@livekit/rtc-node` are peer dependencies — pin the versions you actually run against in your own `package.json`.
## Quickstart [#quickstart]
```ts
import {
  type JobContext,
  type JobProcess,
  ServerOptions,
  cli,
  defineAgent,
  voice,
} from '@livekit/agents';
import * as silero from '@livekit/agents-plugin-silero';
import { Speko } from '@spekoai/sdk';
import { createSpekoComponents } from '@spekoai/adapter-livekit';
import { fileURLToPath } from 'node:url';

const speko = new Speko({ apiKey: process.env.SPEKO_API_KEY! });

export default defineAgent({
  prewarm: async (proc: JobProcess) => {
    proc.userData.vad = await silero.VAD.load();
  },
  entry: async (ctx: JobContext) => {
    const vad = ctx.proc.userData.vad as silero.VAD;
    const { stt, llm, tts } = createSpekoComponents({
      speko,
      vad,
      intent: { language: 'en-US', optimizeFor: 'balanced' },
    });

    const session = new voice.AgentSession({ vad, stt, llm, tts });
    await session.start({
      agent: new voice.Agent({
        instructions: 'You are a helpful voice assistant. Be concise.',
      }),
      room: ctx.room,
    });
    await ctx.connect();

    session.generateReply({ instructions: 'Greet the user and offer your assistance.' });
  },
});

cli.runApp(
  new ServerOptions({
    agent: fileURLToPath(import.meta.url),
    agentName: 'speko-demo',
  }),
);
```
## Architecture [#architecture]
The adapter exports three `@livekit/agents`-compatible classes — `SpekoSTT`, `SpekoLLM`, `SpekoTTS` — and a convenience factory `createSpekoComponents()` that wraps STT and TTS with `StreamAdapter` helpers so Speko's streaming REST proxy can participate in a streaming `voice.AgentSession`:
* **`SpekoSTT`** declares `{ streaming: false }`, so it must be wrapped with `new stt.StreamAdapter(spekoSTT, vad)` to segment utterances with VAD before calling `/v1/transcribe`.
* **`SpekoTTS`** is sentence-bounded in LiveKit, so it is wrapped with `new tts.StreamAdapter(spekoTTS, sentenceTokenizer)` before each streaming `/v1/synthesize` call.
* **`SpekoLLM`** is used directly — it's a `llm.LLM` backed by streaming `/v1/complete` responses.
`createSpekoComponents` handles the wrapping for you and returns `{ stt, llm, tts }` ready to pass to `voice.AgentSession`.
## v1 limitations [#v1-limitations]
* **STT request upload is utterance-bounded.** `/v1/transcribe` streams transcript events back, but this adapter still uploads one VAD-segmented WAV per utterance instead of full-duplex microphone audio.
* **TTS remains sentence-bounded in LiveKit.** `/v1/synthesize` streams audio bytes; the adapter still calls it once per tokenized sentence.
* **Tool calls are supported.** Inline tools return to the LiveKit runtime; registered webhook, builtin, and integration tools run server-side through `/v1/complete`.
* **TTS output format.** Accepts `audio/pcm;rate=NNNN` (Cartesia) and `audio/wav`. Throws on `audio/mpeg` (ElevenLabs MP3) — pick a routing intent that prefers Cartesia, or pin a PCM-capable provider via `constraints.allowedProviders.tts`.
* **STT input format.** Mono PCM16, encoded into a WAV wrapper per utterance. Multi-channel frames throw. Speko handles sample-rate conversion downstream — whatever the `AudioFrame` carries is what's uploaded.
## Reference [#reference]
* [`createSpekoComponents`](/adapter-livekit/create-speko-components) — convenience factory.
* [`SpekoSTT`](/adapter-livekit/speko-stt) — STT class.
* [`SpekoLLM`](/adapter-livekit/speko-llm) — LLM class.
* [`SpekoTTS`](/adapter-livekit/speko-tts) — TTS class.
* [`Intent`](/adapter-livekit/intent) — routing hint type and validator.
* [Audio helpers](/adapter-livekit/audio) — WAV encode/decode utilities.
# SpekoLLM (/adapter-livekit/speko-llm)
LiveKit Agents LLM adapter backed by POST /v1/complete.
`SpekoLLM` is a `llm.LLM` implementation. It flattens a LiveKit `ChatContext` into Speko's `messages` format and calls the proxy. The router picks the best LLM provider per intent and fails over automatically.
```ts
import { SpekoLLM } from '@spekoai/adapter-livekit';

const spekoLLM = new SpekoLLM({
  speko,
  intent: { language: 'en' },
  temperature: 0.7,
  maxTokens: 400,
});
```
Unlike STT and TTS, `SpekoLLM` doesn't need a `StreamAdapter`. It calls the
streaming `/v1/complete` endpoint through the SDK and emits a LiveKit
`LLMStream` chunk when the routed completion is ready.
## Constructor [#constructor]
```ts
new SpekoLLM(options: SpekoLLMOptions)
```
### `SpekoLLMOptions` [#spekollmoptions]
| Field | Type | Required | Description |
| ------------------------ | ----------------------------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `speko` | `Speko` | ✅ | `@spekoai/sdk` client. |
| `intent` | [`Intent`](/adapter-livekit/intent) | ✅ | Validated at construction time. |
| `temperature` | `number?` | | Forwarded to `/v1/complete`. |
| `maxTokens` | `number?` | | Forwarded to `/v1/complete`. |
| `constraints` | `PipelineConstraints?` | | Allow-list constraints. |
| `agentId` | `string?` | | When set, enables the registered-tools loader. The adapter calls `speko.agents.tools.listChatTools(agentId)` once per session — using the `speko` client for auth and base URL — and merges the result with LiveKit's runtime `ToolContext`. Registered tools win on collision. Omit to keep runtime-only behavior. See [tool calling](/guides/tool-calling). |
| `apiBaseUrl` | `string?` | | **Deprecated and ignored** — the loader reads the base URL from the `speko` client. Safe to omit. |
| `apiKey` | `string?` | | **Deprecated and ignored** — the loader reads the API key from the `speko` client. Safe to omit. |
| `onRegisteredToolsError` | `(err: Error) => void?` | | Called once if the registered-tools fetch fails. Soft degradation — the call continues with runtime-only tools rather than crashing. |
## Properties [#properties]
* `label() → 'speko.LLM'`
* `provider = 'speko'`
* `model = 'speko-router'`
## `.chat(params)` [#chatparams]
Standard LiveKit LLM entry point. Returns an `LLMStream` that emits a `ChatChunk`
carrying the assistant response or tool calls, then closes.
Signature (from `@livekit/agents`):
```ts
chat(params: {
  chatCtx: llm.ChatContext;
  toolCtx?: llm.ToolContext;
  connOptions?: APIConnectOptions;
  parallelToolCalls?: boolean;
  toolChoice?: llm.ToolChoice;
  extraKwargs?: Record<string, unknown>;
}): llm.LLMStream;
```
The emitted chunk includes usage:
```ts
{
  id: '',
  delta: { role: 'assistant', content: result.text },
  usage: {
    promptTokens: result.usage.promptTokens,
    completionTokens: result.usage.completionTokens,
    promptCachedTokens: 0,
    totalTokens: result.usage.promptTokens + result.usage.completionTokens,
  },
}
```
## Context conversion — `chatContextToSpeko` [#context-conversion--chatcontexttospeko]
Exported for when you want to reuse the flattening logic (e.g. unit tests, custom pipelines).
```ts
import { chatContextToSpeko } from '@spekoai/adapter-livekit';
const messages = chatContextToSpeko(chatCtx);
```
Rules:
* Only `llm.ChatMessage` items are considered. Function-call and handoff items are skipped.
* Roles are normalised: `developer` → `system`; `system` / `user` / `assistant` pass through; anything else is dropped.
* Empty `textContent` messages are skipped.
* Ordering is preserved.
If the result is empty, `.chat()` rejects with `SpekoAdapterError('INVALID_CONTEXT')`.
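The flattening rules can be illustrated with a simplified model of the chat context. The `ContextItem` union below is a hypothetical stand-in for LiveKit's item types, and `flattenContext` is a sketch of the documented rules rather than the exported `chatContextToSpeko`.

```typescript
type SpekoRole = 'system' | 'user' | 'assistant';

interface SpekoMessage {
  role: SpekoRole;
  content: string;
}

// Simplified, hypothetical stand-in for the items held by a LiveKit ChatContext.
type ContextItem =
  | { type: 'message'; role: string; textContent?: string }
  | { type: 'function_call' | 'function_call_output' | 'agent_handoff' };

function normaliseRole(role: string): SpekoRole | undefined {
  if (role === 'developer') return 'system';
  if (role === 'system' || role === 'user' || role === 'assistant') return role;
  return undefined; // anything else is dropped
}

function flattenContext(items: ContextItem[]): SpekoMessage[] {
  const out: SpekoMessage[] = [];
  for (const item of items) {
    if (item.type !== 'message') continue; // function-call and handoff items are skipped
    const role = normaliseRole(item.role);
    if (!role) continue;
    if (!item.textContent) continue; // empty text is skipped
    out.push({ role, content: item.textContent }); // input order is preserved
  }
  return out;
}
```

An empty return value from this sketch corresponds to the case where `.chat()` reports `INVALID_CONTEXT`.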
## Tool calls [#tool-calls]
Runtime tools from LiveKit's `toolCtx` are forwarded as inline tools. Registered
webhook, builtin, and integration tools can also be loaded by `agentId` — the
same set `speko.agents.tools.listChatTools(agentId)` returns — and executed
server-side by Speko before the final response is returned to LiveKit.
## Errors [#errors]
* `SpekoAdapterError` (exported): thrown for adapter-internal problems. `code` is one of:
  * `'INVALID_CONTEXT'` — `ChatContext` produced no convertible messages.
API-layer errors from the underlying `speko.complete()` surface unchanged — `SpekoApiError`, `SpekoAuthError`, `SpekoRateLimitError` from `@spekoai/sdk`.
# SpekoSTT (/adapter-livekit/speko-stt)
LiveKit Agents STT adapter backed by POST /v1/transcribe.
`SpekoSTT` is a `stt.STT` implementation. It encodes each utterance's audio frames into a WAV payload and uploads it to the Speko proxy. The router picks the best STT provider for your `(language, region, optimizeFor)` and handles failover.
```ts
import { SpekoSTT } from '@spekoai/adapter-livekit';
import { stt as sttNs } from '@livekit/agents';

const spekoSTT = new SpekoSTT({
  speko,
  intent: { language: 'en-US' },
});
const wrapped = new sttNs.StreamAdapter(spekoSTT, vad);
```
## Constructor [#constructor]
```ts
new SpekoSTT(options: SpekoSTTOptions)
```
### `SpekoSTTOptions` [#spekosttoptions]
| Field | Type | Required | Description |
| ------------- | ----------------------------------- | -------- | -------------------------------------------- |
| `speko` | `Speko` | ✅ | `@spekoai/sdk` client. |
| `intent` | [`Intent`](/adapter-livekit/intent) | ✅ | Validated at construction time. |
| `constraints` | `PipelineConstraints?` | | Allow-list constraints passed on every call. |
The constructor calls `validateIntent(intent)` — a broken routing hint throws here rather than deep inside the first transcription.
## Properties [#properties]
* `label = 'speko.STT'`
* `provider = 'speko'`
* `model = 'speko-router'`
* `streaming = false`, `interimResults = false`
## Streaming requirement [#streaming-requirement]
`SpekoSTT.stream()` throws because this adapter uploads one VAD-segmented WAV per
utterance. The `/v1/transcribe` response itself streams transcript events, and
`speko.transcribe()` aggregates the final result for this class. Wrap the
instance:
```ts
import { stt } from '@livekit/agents';
const adapter = new stt.StreamAdapter(spekoSTT, vad);
```
Or use [`createSpekoComponents`](/adapter-livekit/create-speko-components) which does this for you.
## Per-utterance flow [#per-utterance-flow]
1. `StreamAdapter` + VAD segment the user's audio into utterances.
2. `SpekoSTT._recognize(frame, abortSignal)` is invoked for each utterance.
3. Frames are combined (`combineAudioFrames`) and encoded into PCM16 mono WAV via [`framesToWav`](/adapter-livekit/audio#framestowav).
4. The WAV is uploaded via `speko.transcribe()` with the intent header and any `constraints`.
5. The result is emitted as a single `FINAL_TRANSCRIPT` event with confidence defaulting to `1` when the upstream provider doesn't report one.
Aborts propagate: when the session tears down, the `AbortSignal` passed by `StreamAdapter` cancels the in-flight HTTP request.
## Mono-only [#mono-only]
Multi-channel audio throws at the WAV-encode step:
```
SpekoSTT: expected mono audio (1 channel), got 2. …
```
Configure your LiveKit `AgentSession` to pass mono audio, or pre-mix upstream.
# SpekoTTS (/adapter-livekit/speko-tts)
LiveKit Agents TTS adapter backed by POST /v1/synthesize.
`SpekoTTS` is a `tts.TTS` implementation. Each sentence is synthesised via the Speko proxy, decoded into PCM, chunked into `AudioFrame`s at 50 Hz (20 ms frames), and pushed to the LiveKit session.
```ts
import { SpekoTTS } from '@spekoai/adapter-livekit';
import { tts as ttsNs, tokenize } from '@livekit/agents';

const spekoTTS = new SpekoTTS({
  speko,
  intent: { language: 'en' },
  voice: 'sonic-english',
  sampleRate: 24_000,
});
const wrapped = new ttsNs.StreamAdapter(spekoTTS, new tokenize.basic.SentenceTokenizer());
```
## Constructor [#constructor]
```ts
new SpekoTTS(options: SpekoTTSOptions)
```
### `SpekoTTSOptions` [#spekottsoptions]
| Field | Type | Required | Description |
| ------------- | ----------------------------------- | -------- | ------------------------------------------------------------------------------------------------------------------------ |
| `speko` | `Speko` | ✅ | `@spekoai/sdk` client. |
| `intent` | [`Intent`](/adapter-livekit/intent) | ✅ | Validated at construction time. |
| `voice` | `string?` | | Voice id forwarded to the proxy. |
| `speed` | `number?` | | Speech-speed multiplier forwarded to the proxy. |
| `sampleRate` | `number?` | | Output sample rate advertised to LiveKit. Default `24000` (Cartesia Sonic). Must match what the upstream provider emits. |
| `constraints` | `PipelineConstraints?` | | Allow-list constraints. |
## Properties [#properties]
* `label = 'speko.TTS'`
* `provider = 'speko'`
* `model = 'speko-router'`
* `numChannels = 1`, `streaming = false`
## Streaming requirement [#streaming-requirement]
`SpekoTTS.stream()` throws because LiveKit's TTS `StreamAdapter` handles
sentence tokenization for this class. `/v1/synthesize` streams audio bytes for
each sentence request. Wrap:
```ts
import { tts, tokenize } from '@livekit/agents';
const adapter = new tts.StreamAdapter(spekoTTS, new tokenize.basic.SentenceTokenizer());
```
Or use [`createSpekoComponents`](/adapter-livekit/create-speko-components) which does this for you.
## `.synthesize(text, connOptions?, abortSignal?)` [#synthesizetext-connoptions-abortsignal]
Returns a `SpekoTTSChunkedStream` (exported for type use). Internally:
1. Calls `speko.synthesize(text, { ...intent, voice, speed, constraints })`.
2. Decodes the response via [`decodeSynthesisResult`](#decodesynthesisresult).
3. Rejects if the decoded sample rate doesn't match the configured one, which prevents pitch-shifted playback.
4. Chunks the PCM into `AudioFrame`s of `round(sampleRate / 50)` samples each via `AudioByteStream`.
5. Pushes frames onto the output queue, marking the last one `final: true`.
Empty provider output throws `SpekoTTS: provider returned empty audio`.
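The frame-size arithmetic in step 4 is easy to check by hand: at 24000 Hz, `round(24000 / 50)` is 480 samples, or 960 bytes of mono PCM16, per 20 ms frame. The sketch below performs that chunking on a plain byte array. It is not `AudioByteStream`; `chunkPcm16` is a hypothetical helper, and letting the last frame be shorter is an assumption of this sketch.

```typescript
// Splits mono PCM16 bytes into 20 ms frames (50 per second); the last frame may be shorter.
function chunkPcm16(pcm: Uint8Array, sampleRate: number): Uint8Array[] {
  const samplesPerFrame = Math.round(sampleRate / 50);
  const bytesPerFrame = samplesPerFrame * 2; // 2 bytes per 16-bit mono sample
  if (bytesPerFrame <= 0) throw new Error('sampleRate must be positive');

  const frames: Uint8Array[] = [];
  for (let offset = 0; offset < pcm.length; offset += bytesPerFrame) {
    // subarray creates a view, so no audio data is copied.
    frames.push(pcm.subarray(offset, Math.min(offset + bytesPerFrame, pcm.length)));
  }
  return frames;
}
```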
## Audio format support (v1) [#audio-format-support-v1]
`decodeSynthesisResult(result)` branches on `result.contentType`:
| Content type | Behavior |
| --------------------------- | ----------------------------------------------------------------------------------------------------------- |
| `audio/pcm;rate=NNNN` | Raw PCM, rate parsed from the MIME. Channels pinned to `1` (Cartesia's contract). |
| `audio/wav` / `audio/x-wav` | Header stripped via [`parseWav`](/adapter-livekit/audio#parsewav). Stereo WAV throws. |
| `audio/mpeg` | Throws — v1 doesn't include an MP3 decoder. Pin Cartesia or another PCM-capable provider via `constraints`. |
| anything else | Throws with provider info for debugging. |
Work around MP3 by pinning your TTS pool:
```ts
new SpekoTTS({
  speko,
  intent,
  constraints: { allowedProviders: { tts: ['cartesia'] } },
});
```
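If you want to check a content type before handing a response to the decoder, the branching in the table above can be approximated as below. `classifyAudioContentType` is a hypothetical helper, and lower-casing the MIME type before comparison is an assumption of this sketch.

```typescript
type AudioKind = 'pcm' | 'wav' | 'unsupported';

// Hypothetical pre-check that mirrors the table: PCM and WAV decode, everything else throws.
function classifyAudioContentType(contentType: string): AudioKind {
  // Drop parameters such as ";rate=24000" and compare the bare MIME type.
  const mime = (contentType.split(';')[0] ?? '').trim().toLowerCase();
  if (mime === 'audio/pcm') return 'pcm';
  if (mime === 'audio/wav' || mime === 'audio/x-wav') return 'wav';
  return 'unsupported'; // includes audio/mpeg
}
```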
## Sample-rate mismatch [#sample-rate-mismatch]
If the `sampleRate` option and the decoded rate disagree, synthesis rejects:
```
SpekoTTS: provider returned audio at 16000 Hz but the TTS was configured for 24000 Hz. Either set `sampleRate: 16000` on SpekoTTS or pin the Speko router to a provider that matches the expected rate.
```
## `decodeSynthesisResult` [#decodesynthesisresult]
Exported for unit testing. Given a `SynthesizeResult`, returns `{ pcm, sampleRate, channels }`. Throws for unsupported content types (see table above).
```ts
import { decodeSynthesisResult } from '@spekoai/adapter-livekit';
```
# Callbacks & events (/client/callbacks)
Every hook VoiceConversation exposes, and when they fire.
All callbacks are optional. Pass them inside the `ConversationOptions` object. They're invoked synchronously on the media transport event loop — keep them fast or defer work with `queueMicrotask`.
## `ConversationStatus` [#conversationstatus]
```ts
type ConversationStatus = 'connecting' | 'connected' | 'disconnecting' | 'disconnected';
```
Transitions:
* **`connecting`** — the initial state, set the moment the `WebRTCConnection` is constructed.
* **`connected`** — after `room.connect()`, `createLocalAudioTrack()`, and `publishTrack()` all succeed.
* **`disconnecting`** — `endSession()` has been called but the room hasn't acknowledged yet.
* **`disconnected`** — the transport has fired `Disconnected`, OR an error during `connect()` (connection, mic) short-circuited to this state.
`onStatusChange` fires only when the status actually changes; repeated reports of the same status are deduped.
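The dedupe rule amounts to comparing the next status with the current one before notifying. A minimal sketch, assuming a hypothetical `createStatusTracker` helper and assuming the initial `connecting` state is not itself announced:

```typescript
type ConversationStatus = 'connecting' | 'connected' | 'disconnecting' | 'disconnected';

// Hypothetical tracker: the listener runs only when the status actually changes.
function createStatusTracker(onStatusChange: (status: ConversationStatus) => void) {
  let current: ConversationStatus = 'connecting'; // initial state (assumed not announced)
  return {
    set(next: ConversationStatus): void {
      if (next === current) return; // duplicate report, nothing to do
      current = next;
      onStatusChange(next);
    },
    get: (): ConversationStatus => current,
  };
}
```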
## `ConversationMode` [#conversationmode]
```ts
type ConversationMode = 'listening' | 'speaking';
```
Mirrors transport active-speaker events: `speaking` when any remote participant is in the active-speakers set, `listening` otherwise. Useful for UI states like "agent talking now — show the voice animation".
Deduped on transition — `onModeChange` won't fire twice for the same mode.
## `ConversationMessage` [#conversationmessage]
```ts
interface ConversationMessage {
  source: 'agent' | 'user';
  text: string;
  isFinal: boolean;
  segmentId?: string;
}
```
`onMessage` fires from two sources — live transcriptions (the common case when talking to a Speko agent) and custom data-channel packets:
| Inbound event | Becomes |
| -------------------------- | -------------------------------------------------------------------------------------------------------- |
| Transcription segment | `{ source, text, isFinal, segmentId }` — `source` is `user` for the local participant, `agent` otherwise |
| `transcript` packet | `{ source: packet.source, text, isFinal: packet.isFinal ?? true }` |
| `agent_message` packet | `{ source: 'agent', text, isFinal: packet.isFinal ?? true }` |
| `user_message_echo` packet | `{ source: 'user', text, isFinal: true }` |
Transcription updates are **cumulative per segment**: the same `segmentId` is re-delivered with growing `text` (the agent's transcript streams word-by-word; the user's utterance is re-published in full on every recognizer update, and the final text can arrive more than once). Render by **upserting on `(source, segmentId)`** — replace that message's text in place, and only append when you see a new `segmentId`. Appending every message duplicates text, and keying only by `source` corrupts the transcript whenever user and agent updates interleave (which is normal). Messages from custom data packets carry no `segmentId`; append those.
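The upsert rule can be written as a small pure function, which also makes it easy to plug into a state reducer. This is a sketch: `applyMessage` is a hypothetical helper, and the `ConversationMessage` interface is repeated here only so the example is self-contained.

```typescript
interface ConversationMessage {
  source: 'agent' | 'user';
  text: string;
  isFinal: boolean;
  segmentId?: string;
}

// Returns a new transcript with `msg` upserted on (source, segmentId).
function applyMessage(
  transcript: ConversationMessage[],
  msg: ConversationMessage,
): ConversationMessage[] {
  // Messages from custom data packets carry no segmentId, so they are always appended.
  if (msg.segmentId === undefined) return [...transcript, msg];

  const index = transcript.findIndex(
    (m) => m.source === msg.source && m.segmentId === msg.segmentId,
  );
  if (index === -1) return [...transcript, msg]; // first time this segment is seen

  const next = transcript.slice();
  next[index] = msg; // cumulative update: replace the text in place
  return next;
}
```

Passing this function to `onMessage` keeps interleaved user and agent updates on their own rows instead of duplicating text.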
See [Data channel protocol](/client/data-channel) for the raw wire format.
## `DisconnectionDetails` [#disconnectiondetails]
```ts
interface DisconnectionDetails {
  reason: DisconnectionReason;
  message?: string;
}

type DisconnectionReason = 'user' | 'agent' | 'error' | 'timeout' | 'unknown';
```
The SDK maps transport disconnect reasons into a smaller, intent-oriented set:
| Transport disconnect reason | Mapped `reason` |
| ------------------------------------------------ | --------------- |
| Client initiated | `user` |
| Participant removed / room deleted / room closed | `agent` |
| Join failure | `error` |
| everything else (including `undefined`) | `unknown` |
`message` is the raw transport enum name when available (useful for debugging / logging).
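The mapping table translates directly into a switch. In the sketch below the transport reason names (`CLIENT_INITIATED`, `ROOM_DELETED`, and so on) are hypothetical placeholders for whatever enum names your transport reports, and `mapDisconnectReason` is not an SDK export. The table lists no row that maps to `timeout`, so this sketch never returns it.

```typescript
type DisconnectionReason = 'user' | 'agent' | 'error' | 'timeout' | 'unknown';

// Hypothetical transport reason names; substitute the values your transport emits.
function mapDisconnectReason(transportReason: string | undefined): DisconnectionReason {
  switch (transportReason) {
    case 'CLIENT_INITIATED':
      return 'user';
    case 'PARTICIPANT_REMOVED':
    case 'ROOM_DELETED':
    case 'ROOM_CLOSED':
      return 'agent';
    case 'JOIN_FAILURE':
      return 'error';
    default:
      return 'unknown'; // includes undefined
  }
}
```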
## `onConnect` [#onconnect]
```ts
onConnect?: (details: { conversationId: string }) => void;
```
Fires exactly once, after the mic is publishing and status is `connected`. `conversationId` is the transport conversation id (same value as `conversation.getId()`).
## `onError` [#onerror]
```ts
onError?: (error: Error) => void;
```
Non-fatal errors:
* Media device errors from the transport.
* Output device selection failures (`setSinkId` rejections).
Malformed or unrecognised inbound data packets are silently ignored — rooms carry data from other publishers (server control topics, future participants), so a packet that isn't part of the SDK protocol is not an error.
Fatal errors during `create()` are **thrown**, not routed to `onError`. See [Errors](/client/errors).
# Data channel protocol (/client/data-channel)
Wire format for packets exchanged between browser and agent over the media data channel.
Every non-audio signal — transcripts, overrides, user-typed messages — travels as JSON-encoded bytes on the reliable media data channel. `@spekoai/client` handles encoding and decoding internally; this page documents the wire format so server / agent implementations can interoperate.
## Encoding [#encoding]
* UTF-8 JSON, one message per `publishData` call.
* Reliable ordering (`reliable: true`).
* No framing beyond JSON — each `DataReceived` event is one complete packet.
## Outbound (browser → agent) [#outbound-browser--agent]
### `overrides` [#overrides]
Sent once, immediately after the mic publishes, if the browser passed an `overrides` option.
```json
{
"type": "overrides",
"overrides": {
"agent": {
"prompt": "You are a helpful receptionist.",
"firstMessage": "Hi, how can I help?",
"language": "en-US"
},
"tts": {
"voiceId": "sonic-english",
"speed": 1.0
}
}
}
```
Any subfield is optional. The agent worker is responsible for applying what it receives.
### `user_message` [#user_message]
Sent by `conversation.sendUserMessage(text)`. Use when the user types rather than speaks.
```json
{ "type": "user_message", "text": "I'd like to reschedule." }
```
### `contextual_update` [#contextual_update]
Sent by `conversation.sendContextualUpdate(text)`. Out-of-band context that shouldn't be treated as a turn.
```json
{ "type": "contextual_update", "text": "user switched to the checkout page" }
```
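For an agent or test harness that needs to produce these packets without the SDK, the encoding rule (UTF-8 JSON, one message per publish) can be sketched as follows. The `OutboundPacket` union here is a local reconstruction from the three shapes above, and `encodePacket` is a hypothetical helper; the SDK's own types may differ.

```ts
// Local reconstruction of the outbound packet shapes documented above.
interface ConversationOverrides {
  agent?: { prompt?: string; firstMessage?: string; language?: string };
  tts?: { voiceId?: string; speed?: number };
}

type OutboundPacket =
  | { type: 'overrides'; overrides: ConversationOverrides }
  | { type: 'user_message'; text: string }
  | { type: 'contextual_update'; text: string };

// One packet becomes one UTF-8 JSON payload, ready to hand to a reliable publish call.
function encodePacket(packet: OutboundPacket): Uint8Array {
  return new TextEncoder().encode(JSON.stringify(packet));
}

const payload = encodePacket({
  type: 'contextual_update',
  text: 'user switched to the checkout page',
});
```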
## Inbound (agent → browser) [#inbound-agent--browser]
### `transcript` [#transcript]
STT output for either speaker.
```json
{
"type": "transcript",
"source": "user",
"text": "Hello there.",
"isFinal": true
}
```
`isFinal` defaults to `true` when omitted.
### `agent_message` [#agent_message]
An assistant message emitted by the agent — typically streamed token-by-token as `isFinal: false` and closed with `isFinal: true`.
```json
{ "type": "agent_message", "text": "Happy to help!", "isFinal": true }
```
### `user_message_echo` [#user_message_echo]
Echo of a typed `user_message` so the UI can render it in the same transcript stream. `isFinal` is always implicitly `true`.
```json
{ "type": "user_message_echo", "text": "I'd like to reschedule." }
```
## Forwarding to `onMessage` [#forwarding-to-onmessage]
The SDK converts each inbound packet into a [`ConversationMessage`](/client/callbacks#conversationmessage):
```ts
// pseudocode
switch (packet.type) {
case 'transcript':
return { source: packet.source, text: packet.text, isFinal: packet.isFinal ?? true };
case 'agent_message':
return { source: 'agent', text: packet.text, isFinal: packet.isFinal ?? true };
case 'user_message_echo':
return { source: 'user', text: packet.text, isFinal: true };
}
```
Unknown packet types are ignored (no message fired, no error). Malformed JSON is ignored the same way — rooms carry data published for other consumers (server control topics, future participants), so a packet that isn't part of this protocol is not an error.
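A standalone sketch of the decode path, including the ignore rules, might look like the following. It is an approximation of the behavior described above rather than the SDK's source: `decodeToMessage` is a hypothetical name, and the validation steps are assumptions about what counts as malformed.

```ts
// Local mirror of the message shape produced by the mapping above.
interface ConversationMessage {
  source: 'user' | 'agent';
  text: string;
  isFinal: boolean;
}

// Decode one data-channel payload. Returns null for anything that is not a
// recognised protocol packet, so callers can skip it without raising an error.
function decodeToMessage(bytes: Uint8Array): ConversationMessage | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(new TextDecoder().decode(bytes));
  } catch {
    return null; // malformed JSON: ignored, not an error
  }
  if (typeof parsed !== 'object' || parsed === null) return null;

  const packet = parsed as Record<string, unknown>;
  const text = packet['text'];
  if (typeof text !== 'string') return null;

  // `isFinal` defaults to true when omitted.
  const rawFinal = packet['isFinal'];
  const isFinal = typeof rawFinal === 'boolean' ? rawFinal : true;

  switch (packet['type']) {
    case 'transcript': {
      const raw = packet['source'];
      const source = raw === 'user' ? 'user' : raw === 'agent' ? 'agent' : null;
      return source ? { source, text, isFinal } : null;
    }
    case 'agent_message':
      return { source: 'agent', text, isFinal };
    case 'user_message_echo':
      return { source: 'user', text, isFinal: true };
    default:
      return null; // unknown packet type: ignored
  }
}
```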
## Extending the protocol [#extending-the-protocol]
If you need a new packet type, add it on both sides:
1. Agent worker publishes a new `type` value.
2. Extend `InboundPacket` in `@spekoai/client` and handle it in `packetToMessage` (or ship a wrapper that subscribes to `room.on('dataReceived')` directly).
Outbound packet types are similarly open — `WebRTCConnection.publish(packet)` accepts any `OutboundPacket`, which you can widen in a fork.
# Errors (/client/errors)
SpekoClientError and its error codes.
The client SDK throws a single error class, `SpekoClientError`, tagged with a stable string code.
```ts
import { SpekoClientError } from '@spekoai/client';
import type { SpekoClientErrorCode } from '@spekoai/client';
```
## Shape [#shape]
```ts
class SpekoClientError extends Error {
code: SpekoClientErrorCode;
cause?: unknown; // original error when wrapping transport failures
}
```
## Codes [#codes]
| Code | Where it's thrown |
| ------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `CONNECTION_FAILED` | `VoiceConversation.create()` — transport connection rejected. `cause` is the original transport error. |
| `MICROPHONE_FAILED` | `create()` — mic acquisition or `publishTrack()` failed. The room is disconnected before this is thrown. |
| `NOT_CONNECTED` | `sendUserMessage` / `sendContextualUpdate` / internal `publish()` called while status isn't `connected`. |
| `INVALID_MESSAGE` | No longer raised. Malformed inbound data packets are silently ignored (rooms carry non-protocol data from other publishers). Kept in the type for compatibility. |
| `DISCONNECTED` | Reserved for future use. |
## Fatal vs non-fatal [#fatal-vs-non-fatal]
* **Fatal errors** (connection and microphone failures during `create()`) are **thrown** from the `create()` promise so callers can branch at construction time.
* **Non-fatal errors** (media device errors from the transport, output device selection failures) are **routed to `onError`**. The session continues.
* **Non-protocol data packets** (malformed JSON, unknown shapes from other publishers in the room) are **silently ignored** — they never reach `onError`.
## Example [#example]
```ts
try {
const conv = await VoiceConversation.create({ ... });
} catch (err) {
if (err instanceof SpekoClientError) {
switch (err.code) {
case 'CONNECTION_FAILED':
// Token expired, network issue, or transport outage — ask user to retry.
break;
case 'MICROPHONE_FAILED':
// Permission denied or device in use — surface a permissions prompt.
break;
}
}
throw err;
}
```
# @spekoai/client (/client/overview)
Browser SDK for real-time voice conversations.
`@spekoai/client` is the browser-side companion to `@spekoai/sdk`. It connects a browser tab to a Speko voice session: capture the user's microphone, play the agent's audio, and exchange structured events such as transcripts and status changes.
Your server must mint a short-lived session token and return only the browser-safe session credentials. Never expose a Speko API key to browser code. For `VoiceConversation`, audio flows through Speko's browser media transport after the token is minted. For `RealtimeVoiceConversation`, audio flows browser ↔ Speko's S2S WebSocket proxy.
## Install [#install]
```bash
npm install @spekoai/client
# or
pnpm add @spekoai/client
```
The package does not expose low-level media transport types on its public surface, so most apps only import from `@spekoai/client` directly.
## Quick start [#quick-start]
### 1. Server mints a session [#1-server-mints-a-session]
```ts
// server side, using @spekoai/sdk or raw fetch
const res = await fetch('https://api.speko.dev/v1/sessions', { ... });
const session = await res.json();
// session: { transportToken, transportUrl, roomName, identity, expiresAt }
```
See [Build a voice agent](/guides/voice-agent) for the worker side and [Real-time browser conversation](/guides/realtime-conversation) for the end-to-end browser flow.
### 2. Browser joins the room [#2-browser-joins-the-room]
```ts
import { VoiceConversation } from '@spekoai/client';
const conversation = await VoiceConversation.create({
transportToken, // from server
transportUrl, // from server
onConnect: ({ conversationId }) => console.log('connected', conversationId),
onDisconnect: ({ reason }) => console.log('disconnected', reason),
onMessage: ({ source, text, isFinal }) =>
console.log(source, text, isFinal),
onStatusChange: (status) => console.log('status', status),
onModeChange: (mode) => console.log('mode', mode),
onError: (err) => console.error(err),
});
await conversation.setMicMuted(true);
conversation.setVolume(0.8);
conversation.sendUserMessage('hello');
conversation.sendContextualUpdate('user switched to the checkout page');
await conversation.endSession();
```
## What the SDK owns [#what-the-sdk-owns]
* Connecting with supplied short-lived session credentials.
* Acquiring the microphone with sensible constraints (echo cancellation, noise suppression, auto gain — all togglable via `audioConstraints`).
* Playing remote audio.
* Parsing inbound data-channel packets (transcripts, agent messages) and invoking your callbacks.
* Sending outbound packets — overrides, user messages, contextual updates.
* Mic mute, speaker volume, output device selection.
* Tearing everything down on disconnect, including releasing the OS microphone capture.
## What it doesn't do [#what-it-doesnt-do]
* **Mint sessions from API keys.** Keep `SPEKO_API_KEY` on your server. Browser code should only receive short-lived session tokens.
* **Retries.** A failed `create()` throws a [`SpekoClientError`](/client/errors). Retry logic belongs in your app's UX.
* **Tool calls, guardrail hooks, MCP, VAD score streaming.** Deferred — see the package's `ROADMAP.md`.
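Since retries are left to the app, one possible shape is a small wrapper with exponential backoff. This is a sketch under assumptions: `createWithRetry` is a hypothetical helper, it only inspects the `code` property on the thrown error, and the attempt count and delays are arbitrary defaults.

```ts
// Retry an async connect step with exponential backoff.
// Only CONNECTION_FAILED is retried; a microphone failure needs user action instead.
async function createWithRetry<T>(
  connect: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      const code = (err as { code?: unknown } | null)?.code;
      if (code !== 'CONNECTION_FAILED' || attempt === maxAttempts) break;
      // Back off: baseDelayMs, then 2x, then 4x, and so on.
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

The `connect` callback would typically fetch a fresh session from your server and then call `VoiceConversation.create(...)`, so that an expired token is not reused on the next attempt.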
## Reference [#reference]
* [VoiceConversation](/client/voice-conversation) — the primary API surface.
* [RealtimeVoiceConversation](/client/realtime-voice-conversation) — browser capture/playback for S2S WebSocket sessions.
* [Callbacks & events](/client/callbacks) — every hook the SDK exposes.
* [Data channel protocol](/client/data-channel) — wire format for inbound / outbound packets.
* [Errors](/client/errors) — `SpekoClientError` and its codes.
# RealtimeVoiceConversation (/client/realtime-voice-conversation)
Browser capture and playback for direct speech-to-speech WebSocket sessions.
`RealtimeVoiceConversation` is the browser-side helper for Speko speech-to-speech (S2S) sessions. It connects directly to the S2S WebSocket returned by `POST /v1/sessions`, captures the microphone as PCM16, plays streamed PCM16 responses, and forwards transcript and status callbacks.
Use it when you want the lowest-latency S2S path and do not need the browser media transport used by [`VoiceConversation`](/client/voice-conversation).
```ts
import { RealtimeVoiceConversation } from '@spekoai/client';
```
## Mint the session on your server [#mint-the-session-on-your-server]
Create S2S sessions on your backend so `SPEKO_API_KEY` never reaches the browser. Return only the short-lived WebSocket credentials.
```ts server.ts
app.post('/api/realtime-session', async (_req, res) => {
const response = await fetch('https://api.speko.dev/v1/sessions', {
method: 'POST',
headers: {
Authorization: `Bearer ${process.env.SPEKO_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
mode: 's2s',
s2s: {
provider: 'openai',
model: 'gpt-realtime',
voice: 'alloy',
systemPrompt: 'You are a concise voice assistant.',
},
ttlSeconds: 900,
}),
});
if (!response.ok) {
res.status(response.status).json({ error: 'Could not start realtime session' });
return;
}
const session = await response.json();
res.json({
sessionId: session.sessionId,
wsUrl: session.wsUrl,
wsToken: session.wsToken,
expiresAt: session.expiresAt,
inputSampleRate: session.inputSampleRate,
outputSampleRate: session.outputSampleRate,
});
});
```
## Connect from the browser [#connect-from-the-browser]
```tsx RealtimePanel.tsx
import { useEffect, useRef, useState } from 'react';
import { RealtimeVoiceConversation } from '@spekoai/client';

export function RealtimePanel() {
  const convRef = useRef<RealtimeVoiceConversation | null>(null);
  const [status, setStatus] = useState<string>('idle');
  const [transcript, setTranscript] = useState<string[]>([]);

  async function start() {
    setStatus('connecting');
    const session = await fetch('/api/realtime-session', {
      method: 'POST',
    }).then((r) => r.json());
    const conv = await RealtimeVoiceConversation.create({
      ...session,
      onConnect: ({ conversationId }) => {
        console.log('connected', conversationId);
      },
      onStatusChange: setStatus,
      onMessage: ({ source, text, isFinal }) => {
        if (isFinal) setTranscript((items) => [...items, `${source}: ${text}`]);
      },
      onError: (err) => console.error(err),
      onDisconnect: () => setStatus('idle'),
    });
    convRef.current = conv;
  }

  async function stop() {
    await convRef.current?.endSession();
    convRef.current = null;
  }

  // End the session on unmount so navigating away does not leak it.
  useEffect(() => () => { void convRef.current?.endSession(); }, []);

  return (
    <div>
      <button onClick={start}>Start</button>
      <button onClick={stop}>Stop</button>
      <p>Status: {status}</p>
      <ul>
        {transcript.map((item, i) => (
          <li key={i}>{item}</li>
        ))}
      </ul>
    </div>
  );
}
```
## `RealtimeVoiceConversation.create(options)` [#realtimevoiceconversationcreateoptions]
```ts
static create(options: RealtimeConversationOptions): Promise<RealtimeVoiceConversation>
```
`create()` opens the WebSocket, waits for a `ready` frame, starts microphone capture, then resolves.
### `RealtimeConversationOptions` [#realtimeconversationoptions]
| Field | Type | Required | Description |
| ------------------ | ----------------------------------- | -------- | -------------------------------------------------------------------------- |
| `sessionId` | `string` | yes | Server-assigned session id. Also returned by `getId()`. |
| `wsUrl` | `string` | yes | Short-lived S2S WebSocket URL returned by `POST /v1/sessions`. |
| `wsToken` | `string` | yes | Short-lived WebSocket token. Sent as the first WebSocket subprotocol. |
| `expiresAt` | `string?` | | ISO-8601 expiry for the WebSocket token. |
| `inputSampleRate` | `16000 \| 24000?` | | Requested capture rate. Defaults to `24000`; the server can negotiate it. |
| `outputSampleRate` | `16000 \| 24000?` | | Requested playback rate. Defaults to `24000`; the server can negotiate it. |
| `inputDeviceId` | `string?` | | Specific microphone `deviceId`. |
| `audioConstraints` | `AudioConstraints?` | | `echoCancellation`, `noiseSuppression`, `autoGainControl`. |
| `onConnect` | `(d: { conversationId }) => void` | | Fired after the socket is ready and microphone capture has started. |
| `onDisconnect` | `(d: DisconnectionDetails) => void` | | Fired when the client or socket closes. |
| `onMessage` | `(m: ConversationMessage) => void` | | Transcript frames mapped to `{ source, text, isFinal }`. |
| `onStatusChange` | `(s: ConversationStatus) => void` | | `connecting`, `connected`, `disconnecting`, or `disconnected`. |
| `onModeChange` | `(m: ConversationMode) => void` | | `speaking` while response audio is queued, otherwise `listening`. |
| `onError` | `(err: Error) => void` | | WebSocket transport errors and provider error frames. |
## Instance methods [#instance-methods]
### `getId(): string` [#getid-string]
Returns the `sessionId` passed to `create()`.
### `isOpen(): boolean` [#isopen-boolean]
`true` while the SDK status is `connected` and the WebSocket is open.
### `setMicMuted(muted: boolean): Promise<void>` [#setmicmutedmuted-boolean-promisevoid]
Mute or unmute local microphone capture. Muting disables the media track and stops PCM frames from being sent.
### `setVolume(volume: number): void` [#setvolumevolume-number-void]
Set response playback volume from `0` to `1`. Values outside that range are clamped.
### `endSession(): Promise<void>` [#endsession-promisevoid]
Close the WebSocket, stop microphone tracks, clear queued playback, close the `AudioContext`, and transition to `disconnected`.
## Transport notes [#transport-notes]
* The SDK passes `wsToken` as the first WebSocket subprotocol because browsers cannot set custom headers on `new WebSocket()`.
* Outbound microphone audio is sent as 20 ms PCM16 binary frames at the negotiated input sample rate.
* Inbound binary frames are PCM16 response audio at the negotiated output sample rate.
* JSON frames with `t: 'transcript'` are forwarded to `onMessage`. JSON frames with `t: 'error'` are forwarded to `onError`.
* `AudioWorklet` capture is used when available; the SDK falls back to `ScriptProcessorNode` for older browsers.
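To make the 20 ms framing concrete, the sketch below computes the frame size at each supported rate and converts Web Audio float samples to PCM16. It assumes mono audio and little-endian byte order, neither of which is stated above, and both function names are hypothetical.

```ts
// Samples and bytes in one 20 ms PCM16 frame (assumes a single channel).
function frameSize(sampleRate: 16000 | 24000): { samples: number; bytes: number } {
  const samples = (sampleRate * 20) / 1000; // 20 ms worth of samples
  return { samples, bytes: samples * 2 }; // 2 bytes per 16-bit sample
}

// Convert Float32 samples in [-1, 1] to 16-bit signed PCM (little-endian assumed).
function floatToPcm16(input: Float32Array): ArrayBuffer {
  const buffer = new ArrayBuffer(input.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < input.length; i++) {
    // Clamp first so out-of-range samples saturate instead of wrapping.
    const s = Math.max(-1, Math.min(1, input[i] ?? 0));
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```

At 24 kHz a frame is 480 samples (960 bytes); at 16 kHz it is 320 samples (640 bytes).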
# VoiceConversation (/client/voice-conversation)
Primary API — construct, control, and tear down a voice session.
`VoiceConversation` is the public class exported from `@spekoai/client`. Always construct it via the static `create()` factory — the factory awaits connection, so by the time it resolves the session is live.
```ts
import { VoiceConversation } from '@spekoai/client';
```
There is also a legacy namespace export, `Conversation`, with a single method `Conversation.startSession(options)` — it's an alias for `VoiceConversation.create(options)`, kept so consumers migrating from other SDKs can use familiar naming.
## `VoiceConversation.create(options)` [#voiceconversationcreateoptions]
```ts
static create(options: CreateOptions): Promise<VoiceConversation>
```
Where `CreateOptions` is the short-lived token shape:
```ts
type CreateOptions = ConversationOptions;
```
Your backend calls `POST /v1/sessions`, optionally with an [`agentId`](/concepts/agents), and forwards only `transportToken` and `transportUrl` to the browser. `VoiceConversation.create()` connects to the media transport, publishes the microphone track, sends any [`overrides`](#conversationoverrides) over the data channel, fires `onConnect`, and resolves. It throws a [`SpekoClientError`](/client/errors) on connection, network, or microphone failure.
Do not send `SPEKO_API_KEY` to browser code. `VoiceConversation` no longer accepts `agentId`, `apiKey`, or `apiBaseUrl`; session minting belongs on your server.
### `ConversationOptions` [#conversationoptions]
| Field | Type | Required | Description |
| ------------------ | ----------------------------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `transportToken` | `string` | ✅ | Browser-safe media transport token, returned by your server. |
| `transportUrl` | `string` | ✅ | Media transport URL, returned by your server. Pass it straight through — the SDK does not default this so consumers can't ship against the wrong environment. |
| `overrides` | `ConversationOverrides?` | | Per-session agent / TTS overrides. Sent over the data channel right after connect. |
| `inputDeviceId` | `string?` | | Specific microphone `deviceId`. |
| `outputDeviceId` | `string?` | | Specific speaker `deviceId`. Applied via `setSinkId`; silently ignored on browsers without support. |
| `audioConstraints` | `AudioConstraints?` | | `echoCancellation`, `noiseSuppression`, `autoGainControl`. All default `true`. |
| `onConnect` | `(d: { conversationId }) => void` | | Fired after the mic publishes and status becomes `connected`. |
| `onDisconnect` | `(d: DisconnectionDetails) => void` | | Fired on server or client disconnect. |
| `onMessage` | `(m: ConversationMessage) => void` | | Inbound transcripts, agent messages, user-message echoes. |
| `onStatusChange` | `(s: ConversationStatus) => void` | | `connecting → connected → disconnecting → disconnected`. |
| `onModeChange` | `(m: ConversationMode) => void` | | `listening` vs `speaking`, derived from transport active-speaker events. |
| `onError` | `(err: Error) => void` | | Non-fatal errors (media device errors, sink-id failures). Malformed data packets are silently ignored and never reach this callback. |
`conversationToken` and `livekitUrl` are still accepted as legacy aliases (for `transportToken` and `transportUrl`, respectively) for existing callers.
### `ConversationOverrides` [#conversationoverrides]
```ts
interface ConversationOverrides {
agent?: {
prompt?: string;
firstMessage?: string;
language?: string;
};
tts?: {
voiceId?: string;
speed?: number;
};
}
```
Overrides are JSON-serialized and published over the data channel immediately after the mic is live. The agent worker can read them and reconfigure the session before its first reply.
### `AudioConstraints` [#audioconstraints]
```ts
interface AudioConstraints {
echoCancellation?: boolean; // default true
noiseSuppression?: boolean; // default true
autoGainControl?: boolean; // default true
}
```
The SDK always routes through `createLocalAudioTrack({ ... })` rather than `setMicrophoneEnabled(true)` so that constraints are applied even when no `inputDeviceId` is passed — `setMicrophoneEnabled` silently ignores them in that case.
## Instance methods [#instance-methods]
### `getId(): string` [#getid-string]
Returns the transport conversation id. Populated after `create()` resolves.
### `isOpen(): boolean` [#isopen-boolean]
`true` while the underlying status is `connected`.
### `setMicMuted(muted: boolean): Promise<void>` [#setmicmutedmuted-boolean-promisevoid]
Mute / unmute the local microphone track. Uses the track-level mute API when a track is attached; falls back to `LocalParticipant.setMicrophoneEnabled()` otherwise.
### `setVolume(volume: number): void` [#setvolumevolume-number-void]
Set playback volume for every remote audio element (0–1, clamped). Applied immediately to existing elements and to future ones.
### `sendUserMessage(text: string): void` [#sendusermessagetext-string-void]
Publish a `user_message` packet over the reliable data channel. Use when the user types rather than speaks — the agent receives it inline with its transcript stream.
### `sendContextualUpdate(text: string): void` [#sendcontextualupdatetext-string-void]
Publish a `contextual_update` packet. Use for out-of-band context (e.g. "user switched to the checkout page"). Separate from `user_message` so agents can treat it as system-level context rather than a turn.
### `endSession(): Promise<void>` [#endsession-promisevoid]
Initiate clean disconnection. Sets status to `disconnecting`, asks the transport to disconnect; the disconnect event completes the teardown (stops the mic track, removes audio elements, fires `onDisconnect`). Idempotent — calling it twice is a no-op.
## Teardown invariants [#teardown-invariants]
When disconnection completes (whether triggered by `endSession()`, agent leaving, token expiry, or error), the SDK:
1. Sets status to `disconnected` and fires `onStatusChange`.
2. Stops the local microphone track so the browser's mic indicator goes away.
3. Detaches and removes every `<audio>` element it added to `document.body`.
4. Fires `onDisconnect` with a mapped [`DisconnectionReason`](/client/callbacks#disconnectionreason).
Your component's unmount effect should call `endSession()` so navigating away doesn't leak a live transport session.
# callbacks (/sdk/callbacks)
Scheduled callback resources created from call analysis.
`speko.callbacks` lists and controls scheduled callbacks. Callbacks are usually created by post-call analysis when the caller requested a follow-up.
```ts
const { callbacks } = await speko.callbacks.list({ status: 'scheduled' });
await speko.callbacks.dispatch(callbacks[0]!.id);
```
## Methods [#methods]
| Method | Description |
| ----------------------------- | ----------------------------------------------------------------------------- |
| `list(params?)` | List scheduled callbacks. Filter by `status`, `sourceSessionId`, and `limit`. |
| `get(callbackId)` | Fetch one callback. |
| `cancel(callbackId, params?)` | Cancel a callback with an optional reason. |
| `dispatch(callbackId)` | Dispatch a callback immediately. |
## ScheduledCallback [#scheduledcallback]
| Field | Type | Description |
| -------------------- | ------------------------------------------------------------------------- | ---------------------------------------------- |
| `id` | `string` | Callback id. |
| `source_session_id` | `string \| null` | Call that requested the callback. |
| `created_session_id` | `string \| null` | Outbound call session created when dispatched. |
| `agent_id` | `string \| null` | Agent to run. |
| `phone_number_id` | `string \| null` | Caller ID phone-number row. |
| `to_number` | `string` | Destination number. |
| `from_number` | `string \| null` | Caller ID. |
| `scheduled_at` | `string` | ISO timestamp. |
| `status` | `'scheduled' \| 'dispatching' \| 'dispatched' \| 'cancelled' \| 'failed'` | Callback state. |
| `pipeline_config` | `Record<string, unknown>` | Resolved call pipeline. |
| `metadata` | `Record<string, unknown>` | Free-form metadata and analysis context. |
# calls (/sdk/calls)
Inspect calls, reports, events, recordings, and transfers.
`speko.calls` exposes operational data and live transfer controls for voice calls.
```ts
const detail = await speko.calls.get('call_123');
const { events } = await speko.calls.events('call_123');
const recording = await speko.calls.recording('call_123');
```
The call id is the Speko voice session id returned by `speko.voice.dial()` and surfaced in inbound call metadata and webhooks.
## Methods [#methods]
| Method | Description |
| --------------------------------------------------- | ---------------------------------------------------------------------- |
| `get(callId)` | Fetch call detail, transcript, report, and transfer attempts. |
| `events(callId)` | List lifecycle, SIP, and transfer events. |
| `report(callId)` | Fetch the post-call report. |
| `finalizeReport(callId, params?)` | Run or rerun analysis and optionally retry post-call webhook delivery. |
| `recording(callId)` | Return a signed recording URL. |
| `blindTransfer(callId, params)` | Transfer the active SIP participant directly. |
| `warmTransfer(callId, params)` | Start a consultative transfer with optional sequential destinations. |
| `completeWarmTransfer(callId, transferId, params?)` | Bridge the screened recipient into the original room. |
| `cancelWarmTransfer(callId, transferId, params?)` | Cancel a warm transfer, optionally trying the next destination. |
## Agent call history [#agent-call-history]
Call listing is scoped through agents:
```ts
const page = await speko.agents.listCalls('agent_123', { limit: 25 });
if (page.next_cursor) {
await speko.agents.listCalls('agent_123', { cursor: page.next_cursor });
}
```
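To walk the full history rather than two pages, the cursor loop can be wrapped in an async generator. This is a generic sketch: `allPages` is a hypothetical helper, and the only field it relies on is `next_cursor`, since the rest of the page shape is not documented here.

```ts
// Minimal page contract: only the cursor field shown above is assumed.
interface CursorPage {
  next_cursor?: string | null;
}

// Yield pages until the server stops returning a cursor.
async function* allPages<P extends CursorPage>(
  fetchPage: (cursor?: string) => Promise<P>,
): AsyncGenerator<P> {
  let cursor: string | undefined;
  do {
    const page = await fetchPage(cursor);
    yield page;
    cursor = page.next_cursor ?? undefined;
  } while (cursor);
}
```

With the SDK this might be consumed as `for await (const page of allPages((cursor) => speko.agents.listCalls('agent_123', { limit: 25, cursor }))) { ... }`.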
## Warm transfer [#warm-transfer]
```ts
const transfer = await speko.calls.warmTransfer('call_123', {
from: '+12015550199',
destinations: [
{ to: '+12015551234', label: 'Front desk' },
{ to: '+12015554321', label: 'Overflow' },
],
screeningPrompt: 'Confirm the recipient can help before bridging.',
fallback: { strategy: 'take_message' },
voicemailDetection: { mode: 'agent', timeoutSeconds: 10 },
});
```
Warm transfers return the active transfer plus `routing_attempts`. Cancelling with `tryNext: true` asks the server to continue with the next destination when one exists.
## Reports and webhooks [#reports-and-webhooks]
`report(callId)` returns the finalized post-call report when available: summary, outcome, structured data, transcript entries, cost breakdown, artifacts, metadata, and scheduled callback context. `finalizeReport(callId, { forceAnalysis: true, retryWebhook: true })` reruns analysis and retries delivery of the agent `postCall` webhook.
Use `events(callId)` for the durable event stream. It includes Speko, LiveKit, Telnyx, SIP status, failure, and transfer events, which is the closest equivalent to a carrier status timeline.
# complete (/sdk/complete)
POST /v1/complete — LLM completion with automatic provider routing.
Run an LLM completion. The router picks the best LLM provider for your intent and fails over automatically.
```ts
const { text, provider } = await speko.complete({
messages: [{ role: 'user', content: 'Hi!' }],
intent: { language: 'en' },
});
```
## Signature [#signature]
```ts
speko.complete(
params: CompleteParams,
abortSignal?: AbortSignal,
): Promise<CompleteResult>
speko.completeStream(
params: CompleteParams,
abortSignal?: AbortSignal,
): AsyncIterable
```
## Parameters [#parameters]
### `params: CompleteParams` [#params-completeparams]
| Field | Type | Description |
| ------------------- | ---------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- |
| `messages` | `ChatMessage[]` | Conversation history. Roles: `system`, `user`, `assistant`, `tool`. |
| `intent` | `RoutingIntent` | `language`, optional `region` and `optimizeFor`. |
| `systemPrompt` | `string?` | Shortcut for a leading system message. Providers that distinguish the system channel use it natively; others fold it into the message list. |
| `temperature` | `number?` | Forwarded to the provider. Defaults to the provider's default. |
| `maxTokens` | `number?` | Max completion tokens. Defaults to the provider's default. |
| `reasoningEffort` | `'none' \| 'minimal' \| 'low' \| 'medium' \| 'high' \| 'xhigh'?` | OpenAI reasoning-model effort override. Defaults are tuned from `intent.optimizeFor`. |
| `constraints` | `PipelineConstraints?` | Allow-list constraints. |
| `tools` | `ChatTool[]?` | JSON Schema tool definitions exposed to the model. |
| `toolChoice` | `ChatToolChoice?` | `auto`, `none`, `required`, or a specific function name. |
| `parallelToolCalls` | `boolean?` | Provider hint for whether multiple tool calls may be emitted in one turn. |
| `maxToolHops` | `number?` | Server-side hop cap for webhook or builtin tools. Defaults to 8. |
### `ChatMessage` [#chatmessage]
```ts
interface ChatMessage {
role: 'system' | 'user' | 'assistant' | 'tool';
content: string;
toolCalls?: ChatToolCall[];
toolCallId?: string;
isError?: boolean;
}
interface ChatToolCall {
id: string;
name: string;
args: string;
}
interface ChatTool {
name: string;
description: string;
parameters: Record<string, unknown>;
executionMode?: 'inline' | 'webhook' | 'builtin';
}
```
### `abortSignal?: AbortSignal` [#abortsignal-abortsignal]
Cancel an in-flight request.
## Returns [#returns]
### `CompleteResult` [#completeresult]
| Field | Type | Description |
| ------------------------ | ----------------- | ------------------------------------------------------------------------- |
| `text` | `string` | Assistant reply. |
| `provider` | `string` | Upstream LLM provider (e.g. `openai`, `anthropic`, `groq`). |
| `model` | `string` | Provider-specific model id. |
| `usage.promptTokens` | `number` | Prompt token count. |
| `usage.completionTokens` | `number` | Completion token count. |
| `failoverCount` | `number` | Providers tried before this one succeeded. |
| `scoresRunId` | `string \| null` | Scoring run id that selected this provider. |
| `toolCalls` | `ChatToolCall[]?` | Tool calls emitted by the assistant when inline tool execution is needed. |
## Streaming [#streaming]
The wire response is `text/event-stream` with `meta`, `delta`, `tool_call`,
`server_tool_call`, `done`, and `error` events. `speko.complete()` consumes that
stream and returns the final `CompleteResult`; use `speko.completeStream()` to
render deltas or tool-call progress as it arrives.
## Tool execution [#tool-execution]
Tools can run inline in your worker, through Speko-managed webhooks, or as builtins. Omitting `executionMode` preserves the inline behavior: the model's tool calls return in `toolCalls`, and your app adds `role: 'tool'` messages before calling `complete()` again. Webhook and builtin tools are executed by Speko server-side and may emit `server_tool_call` streaming events before the final response.
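The inline flow described above can be sketched as a loop that feeds tool results back until the model stops asking for tools. Everything here is an illustration under assumptions: the types are local mirrors, `runInlineTools` and its client-side hop cap are hypothetical, and `args` is assumed to be a JSON-encoded string.

```ts
interface ChatToolCall { id: string; name: string; args: string }
interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string;
  toolCalls?: ChatToolCall[];
  toolCallId?: string;
  isError?: boolean;
}
// Stand-in for a call such as speko.complete({ messages, intent, tools }).
type CompleteFn = (messages: ChatMessage[]) => Promise<{ text: string; toolCalls?: ChatToolCall[] }>;
type ToolHandler = (args: unknown) => Promise<string> | string;

async function runInlineTools(
  complete: CompleteFn,
  messages: ChatMessage[],
  handlers: Record<string, ToolHandler>,
  maxHops = 8, // client-side safety cap chosen for this sketch
): Promise<string> {
  const history = [...messages];
  for (let hop = 0; hop < maxHops; hop++) {
    const turn = await complete(history);
    if (!turn.toolCalls?.length) return turn.text; // no tools requested: done
    history.push({ role: 'assistant', content: turn.text, toolCalls: turn.toolCalls });
    for (const call of turn.toolCalls) {
      try {
        const handler = handlers[call.name];
        if (!handler) throw new Error(`Unknown tool: ${call.name}`);
        const output = await handler(JSON.parse(call.args));
        history.push({ role: 'tool', content: output, toolCallId: call.id });
      } catch (err) {
        // Report the failure to the model instead of aborting the loop.
        history.push({ role: 'tool', content: String(err), toolCallId: call.id, isError: true });
      }
    }
  }
  throw new Error(`Tool loop exceeded ${maxHops} hops`);
}
```

Passing `complete` as a function keeps the loop independent of routing parameters; in practice it would wrap `speko.complete()` with your `intent` and `tools`.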
## Example: multi-turn [#example-multi-turn]
```ts
const messages: ChatMessage[] = [
{ role: 'system', content: 'You are a concise voice assistant.' },
{ role: 'user', content: 'Book me an appointment for Tuesday.' },
];
const first = await speko.complete({
messages,
intent: { language: 'en' },
temperature: 0.3,
maxTokens: 200,
});
messages.push({ role: 'assistant', content: first.text });
messages.push({ role: 'user', content: '3pm, with Dr. Chen.' });
const second = await speko.complete({
messages,
intent: { language: 'en' },
});
```
## Example: pin a provider [#example-pin-a-provider]
```ts
await speko.complete({
messages: [...],
intent: { language: 'en' },
constraints: { allowedProviders: { llm: ['anthropic'] } },
});
```
# credits (/sdk/credits)
Prepaid balance and append-only ledger.
Query the organization's prepaid credit balance and walk the ledger of every credit movement. Exposed via `speko.credits`.
Balances are returned in USD. Ledger amounts still use micro-USD strings because ledger entries are signed accounting units.
```ts
const { balanceUsd } = await speko.credits.getBalance();
if (balanceUsd < 0.5) showLowBalanceBanner();
```
## `speko.credits.getBalance()` [#spekocreditsgetbalance]
### Signature [#signature]
```ts
speko.credits.getBalance(): Promise<OrganizationBalance>
```
### Returns — `OrganizationBalance` [#returns--organizationbalance]
| Field | Type | Description |
| ------------ | -------- | -------------------------------------------- |
| `balanceUsd` | `number` | Current prepaid balance in USD. |
| `currency` | `'USD'` | Currency for `balanceUsd`. |
| `updatedAt` | `string` | ISO-8601 timestamp of the last ledger event. |
## `speko.credits.getLedger(params?)` [#spekocreditsgetledgerparams]
Most-recent-first page of credit movements (grants, debits, topups, refunds, adjustments). Pass the previous response's `nextCursor` back as `cursor` to continue; `null` means the history is exhausted.
### Signature [#signature-1]
```ts
speko.credits.getLedger(
params?: CreditLedgerQueryParams,
): Promise<CreditLedgerPage>
```
### `CreditLedgerQueryParams` [#creditledgerqueryparams]
| Field | Type | Description |
| -------- | --------- | --------------------------------------------- |
| `limit` | `number?` | Page size. Server default applies if omitted. |
| `cursor` | `string?` | `nextCursor` from a previous response. |
### Returns — `CreditLedgerPage` [#returns--creditledgerpage]
| Field | Type | Description |
| ------------ | --------------------- | -------------------------------------------------------- |
| `entries` | `CreditLedgerEntry[]` | Page contents. |
| `nextCursor` | `string \| null` | Pass back as `cursor` for the next page, `null` if done. |
### `CreditLedgerEntry` [#creditledgerentry]
| Field | Type | Description |
| ---------------- | ----------------------------------------------------------- | ---------------------------------------------------------------- |
| `id` | `string` | Ledger entry id. |
| `kind` | `'grant' \| 'debit' \| 'topup' \| 'refund' \| 'adjustment'` | Movement type. |
| `amountMicroUsd` | `string` | Signed. Positive for grants/topups/refunds, negative for debits. |
| `metric` | `string \| null` | Metric when tied to a usage row. |
| `provider` | `string \| null` | Upstream provider the debit was applied to. |
| `sessionId` | `string \| null` | Session the debit was applied against. |
| `createdAt` | `string` | ISO-8601. |
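Because `amountMicroUsd` is a signed string while balances are plain USD numbers, a small conversion helper is usually needed before displaying or summing ledger rows. The sketch below assumes the string is a base-10 integer; `microUsdToUsd` and `netUsd` are hypothetical helper names, not SDK exports.

```typescript
const MICRO_PER_USD = 1_000_000;

/** Convert one signed micro-USD string to a USD number. */
function microUsdToUsd(amountMicroUsd: string): number {
  const micro = Number(amountMicroUsd);
  // Assumption: amounts are integers small enough to be exact in a double.
  if (!Number.isSafeInteger(micro)) {
    throw new RangeError(`Not a safe integer micro-USD amount: ${amountMicroUsd}`);
  }
  return micro / MICRO_PER_USD;
}

/** Net movement in USD. Sums in integer micro-USD first to avoid float drift. */
function netUsd(entries: ReadonlyArray<{ amountMicroUsd: string }>): number {
  let totalMicro = 0;
  for (const entry of entries) {
    const micro = Number(entry.amountMicroUsd);
    if (!Number.isSafeInteger(micro)) {
      throw new RangeError(`Not a safe integer micro-USD amount: ${entry.amountMicroUsd}`);
    }
    totalMicro += micro;
  }
  return totalMicro / MICRO_PER_USD;
}
```

Summing in integer micro-USD and dividing once at the end keeps the total exact for realistic balances.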
## Example — paginate the ledger [#example--paginate-the-ledger]
```ts
let cursor: string | undefined = undefined;
do {
const page = await speko.credits.getLedger({ limit: 50, cursor });
for (const entry of page.entries) {
console.log(entry.createdAt, entry.kind, entry.amountMicroUsd);
}
cursor = page.nextCursor ?? undefined;
} while (cursor);
```
# Errors (/sdk/errors)
Exception classes thrown by @spekoai/sdk.
Every non-2xx API response is re-thrown as a typed error. Import from the package root:
```ts
import { SpekoApiError, SpekoAuthError, SpekoRateLimitError } from '@spekoai/sdk';
```
## `SpekoApiError` [#spekoapierror]
Base class for all API errors. Carries HTTP status plus a server-provided `code`.
```ts
class SpekoApiError extends Error {
status: number;
code: string;
}
```
Thrown on any non-2xx response that isn't a 401 or 429. `code` is parsed from the JSON body (`{ "error": "...", "code": "..." }`); falls back to `"UNKNOWN"` when the body isn't JSON.
## `SpekoAuthError` [#spekoautherror]
Thrown on HTTP 401. Extends `SpekoApiError` with `status: 401`, `code: 'AUTH_ERROR'`.
```ts
try {
await speko.complete({ ... });
} catch (err) {
if (err instanceof SpekoAuthError) {
// Prompt the user to rotate their API key.
}
}
```
## `SpekoRateLimitError` [#spekoratelimiterror]
Thrown on HTTP 429. Extends `SpekoApiError` with `code: 'RATE_LIMITED'` and a parsed `retryAfter` (seconds) from the `Retry-After` response header.
```ts
class SpekoRateLimitError extends SpekoApiError {
retryAfter: number | null;
}
```
Example backoff:
```ts
async function withRetry<T>(fn: () => Promise<T>): Promise<T> {
try {
return await fn();
} catch (err) {
if (err instanceof SpekoRateLimitError) {
const wait = (err.retryAfter ?? 1) * 1000;
await new Promise((r) => setTimeout(r, wait));
return fn();
}
throw err;
}
}
```
## Timeouts and cancellation [#timeouts-and-cancellation]
When the internal timeout fires or an external `AbortSignal` is aborted, `fetch` rejects with a `DOMException` named `AbortError`. The SDK does **not** re-wrap these — callers can distinguish abort from API failure with a standard `instanceof` / `.name === 'AbortError'` check.
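A minimal sketch of that check follows. `isAbortError` and `callOrNull` are hypothetical helpers, not SDK exports; the check is done by `name` rather than `instanceof DOMException` so it does not depend on which runtime created the error.

```typescript
/** True when the thrown value looks like a fetch abort. */
function isAbortError(err: unknown): boolean {
  return (
    typeof err === 'object' &&
    err !== null &&
    (err as { name?: unknown }).name === 'AbortError'
  );
}

/** Run a cancellable call; resolve to null on abort, rethrow everything else. */
async function callOrNull<T>(
  fn: (signal: AbortSignal) => Promise<T>,
  signal: AbortSignal,
): Promise<T | null> {
  try {
    return await fn(signal);
  } catch (err) {
    if (isAbortError(err)) return null; // cancelled or timed out, not an API failure
    throw err;
  }
}
```

API failures still surface as `SpekoApiError` (or a subclass) and pass straight through the `throw err` branch.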
# @spekoai/sdk (/sdk/overview)
Official TypeScript SDK — one API, every voice provider.
`@spekoai/sdk` is the server-side TypeScript SDK for the Speko voice gateway. It wraps the one-shot proxy endpoints — `/v1/transcribe`, `/v1/synthesize`, `/v1/complete` — plus usage, credits, realtime S2S, outbound voice dialing, phone numbers, agents, registered tools, and knowledge bases. Speko benchmarks providers per `(language, region, optimizeFor)` and routes each STT, LLM, and TTS call to the best available option. Failover is handled server-side. You ship one integration; providers rotate without a code change.
## Install [#install]
```bash
npm install @spekoai/sdk
# or
pnpm add @spekoai/sdk
```
Runtime: Node 18+ (uses global `fetch` and `AbortController`). Also works in Bun, Deno, and any runtime with `fetch`.
## Quickstart [#quickstart]
```ts
import { Speko } from '@spekoai/sdk';
import { readFile } from 'node:fs/promises';
const speko = new Speko({ apiKey: process.env.SPEKO_API_KEY! });
// Transcribe
const audio = await readFile('./call.wav');
const { text, provider, confidence } = await speko.transcribe(audio, {
language: 'es-MX',
});
// Synthesize
const speech = await speko.synthesize('Hello world', {
language: 'en',
});
// Complete
const { text: reply } = await speko.complete({
messages: [{ role: 'user', content: 'Hi!' }],
intent: { language: 'en' },
});
```
## What the router does [#what-the-router-does]
Every routed one-shot call takes a `RoutingIntent` (`language`, optional `region` and `optimizeFor`). Speko scores every provider for that intent against its continuously-updated benchmark set, picks the top-ranked one, and fails over to the next-best if the primary errors. Response headers (`X-Speko-Provider`, `X-Speko-Model`, `X-Speko-Failover-Count`, `X-Speko-Scores-Run-Id`) are surfaced on every result object so you can log what actually ran.
If you need to restrict the pool (for compliance, cost caps, or pinning a provider while debugging), pass `constraints.allowedProviders[modality]`. The router still ranks by score — it just picks the top-ranked candidate from your allow-list.
## Cancellation [#cancellation]
Every method accepts an optional `AbortSignal` as the last argument. The signal is composed with the client's timeout (30 s by default), so external cancellation wins whenever it fires first. Frameworks like `@spekoai/adapter-livekit` pass their own signals through when a session tears down mid-call.
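To make "composed with the client's timeout" concrete, here is one way such composition could look. This is an illustrative sketch only, not the SDK's internals; `composeWithTimeout` is a hypothetical name.

```typescript
interface ComposedSignal {
  signal: AbortSignal;
  /** Clear the timer and detach the listener once the request settles. */
  dispose: () => void;
}

function composeWithTimeout(
  external: AbortSignal | undefined,
  timeoutMs: number,
): ComposedSignal {
  const controller = new AbortController();
  // Timeout path: abort when the deadline passes.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  // External path: abort as soon as the caller's signal fires.
  const onAbort = () => controller.abort();
  if (external) {
    if (external.aborted) controller.abort();
    else external.addEventListener('abort', onAbort, { once: true });
  }
  return {
    signal: controller.signal,
    dispose: () => {
      clearTimeout(timer);
      external?.removeEventListener('abort', onAbort);
    },
  };
}
```

Whichever source fires first aborts the composed signal, which matches the "external cancellation wins whenever it fires first" behavior described above.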
## Errors [#errors]
All API failures throw `SpekoApiError` or one of its subclasses:
* `SpekoAuthError` — 401, bad or missing API key.
* `SpekoRateLimitError` — 429, carries `retryAfter` seconds parsed from `Retry-After`.
* `SpekoApiError` — any other non-2xx response. `status` is the HTTP status; `code` is parsed from the JSON body.
See [Errors](/sdk/errors) for details.
## Reference [#reference]
Constructor, options, concurrency, auth.
`POST /v1/transcribe`.
`POST /v1/synthesize`.
`POST /v1/complete`.
Speech-to-speech WebSocket sessions.
Outbound phone calls.
Managed numbers, SIP imports, and business verification.
Call detail, events, reports, recordings, and transfers.
Scheduled follow-up calls.
`GET /v1/usage`.
Balance and ledger.
Persisted voice personas and registered tools.
Shared request / response types.
Exception classes.
# phone numbers (/sdk/phone-numbers)
Provision managed numbers, import SIP trunk numbers, and manage phone-number business verification.
`speko.phoneNumbers` manages caller IDs and inbound numbers for the authenticated organization.
```ts
const numbers = await speko.phoneNumbers.list();
const imported = await speko.phoneNumbers.importSipTrunk({
e164: '+12015550199',
sipConnectionInstallationId: '00000000-0000-4000-8000-000000000010',
direction: 'both',
agentId: 'agent_123',
});
```
Managed number purchase is the US-number path and requires phone-number business verification plus sufficient credits. For numbers you already own, or non-US carrier paths, import a SIP trunk number and link it to an agent.
Inbound routing is controlled by `direction`, `agentId`, and `dispatchMetadataTemplate`. A number that allows inbound calls must have either a linked agent or a dispatch metadata template. Linked agents hydrate the call prompt, routing intent, provider preferences, tools, and lifecycle webhooks; templates add static or token-substituted metadata.
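The inbound rule above can be checked client-side before calling `update()` or `importSipTrunk()`. The sketch below is a hypothetical pre-flight check, not an SDK function; the `Record<string, unknown>` template type is an assumption about the shape of `dispatchMetadataTemplate`.

```typescript
interface InboundConfig {
  direction: 'inbound' | 'outbound' | 'both';
  agentId: string | null;
  // Assumption: the template is a plain key/value object.
  dispatchMetadataTemplate: Record<string, unknown> | null;
}

/** Returns a human-readable problem, or null when the config satisfies the rule. */
function inboundConfigProblem(config: InboundConfig): string | null {
  const allowsInbound = config.direction === 'inbound' || config.direction === 'both';
  if (!allowsInbound) return null; // outbound-only numbers need neither
  if (config.agentId !== null || config.dispatchMetadataTemplate !== null) return null;
  return 'Inbound numbers need a linked agent or a dispatch metadata template.';
}
```

The server remains the source of truth; a check like this only surfaces the likely rejection earlier.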
## Methods [#methods]
| Method | Description |
| -------------------------- | ---------------------------------------------------------------------------- |
| `list()` | List all organization phone numbers. |
| `searchAvailable(params?)` | Search platform-managed numbers available to buy. |
| `get(id)` | Fetch one phone-number row. |
| `create(params)` | Buy a managed number. Requires business verification and sufficient credits. |
| `importSipTrunk(params)` | Register a customer-owned SIP-trunk number. |
| `update(id, params)` | Update direction, metadata template, label, or linked agent. |
| `delete(id)` | Release or unregister a phone number. |
| `getKyb()` | Read business verification state. |
| `saveKybDraft(params)` | Save a business verification draft. |
| `submitKyb(params)` | Submit business verification for review. |
## PhoneNumberRow [#phonenumberrow]
| Field | Type | Description |
| ----------------------------------------------------------- | ----------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `id` | `string` | Speko phone-number id. |
| `e164` | `string` | E.164 phone number. |
| `source` | `'managed' \| 'sip_trunk'` | Platform-managed number or customer SIP-trunk number. |
| `providerResourceId` | `string \| null` | Platform-neutral managed provider id. |
| `sipConnectionInstallationId` | `string \| null` | Installed SIP connection used for productized SIP imports. |
| `sipProviderName` | `string \| null` | Display label for the SIP provider/account. |
| `direction` | `'inbound' \| 'outbound' \| 'both'` | Allowed call direction. |
| `agentId` | `string \| null` | Agent linked for inbound calls. |
| `dispatchMetadataTemplate` | `Record \| null` | Optional template merged into inbound session metadata. Supports tokens such as `{{callerNumber}}`, `{{dialedNumber}}`, and `{{forwardedFromNumber}}`. |
| `setupStatus` | `PhoneNumberSetupStatus` | Readiness state for inbound/outbound use. |
| `sms10dlcProfileId`, `smsCampaignId`, `smsAssignmentStatus` | nullable | SMS assignment state when applicable. |
Pass `null` to `update()` for `label`, `dispatchMetadataTemplate`, or `agentId` to clear them.
## Inbound forwarding [#inbound-forwarding]
When a carrier forwards a call, Speko attempts to normalize the original forwarding source from provider fields and SIP headers. The value is available as `forwardedFromNumber` in call metadata and as `forwarded_from_number` in pre-call webhooks.
## Business verification [#business-verification]
Managed phone-number purchases are gated by phone-number KYB.
```ts
const kyb = await speko.phoneNumbers.getKyb();
await speko.phoneNumbers.submitKyb({
businessProfile: {
legalName: 'Acme Inc.',
displayName: 'Acme',
entityType: 'Corporation',
country: 'US',
website: 'https://example.com',
address: {
street: '1 Market St',
city: 'San Francisco',
state: 'CA',
postalCode: '94105',
country: 'US',
},
useCase: 'Customer support calls and appointment reminders.',
expectedUsage: '500 calls per month in the US.',
},
authorizedRepresentative: {
name: 'Ava Martinez',
title: 'Operations Lead',
email: 'ava@example.com',
phone: '+12015551234',
},
attestationAccepted: true,
});
```
# realtime (/sdk/realtime)
Speech-to-speech WebSocket sessions — OpenAI Realtime, Gemini Live, xAI Grok Voice, and Inworld.
Open a speech-to-speech (S2S) session. `speko.realtime.connect()` mints a short-lived WebSocket token via `POST /v1/sessions` and returns a handle connected directly to Speko's S2S proxy, which bridges to the underlying provider. The browser media transport is skipped entirely so time-to-first-audio stays under \~300 ms.
`Realtime` targets the browser `WebSocket` global. In Node 22+ you can polyfill with `ws` via `globalThis.WebSocket = (await import('ws')).WebSocket;` — but the typical deployment is a browser paired with `@spekoai/client` for mic capture.
```ts
import { Speko } from '@spekoai/sdk';
const speko = new Speko({ apiKey: import.meta.env.VITE_SPEKO_API_KEY });
const session = await speko.realtime.connect({
provider: 'openai',
model: 'gpt-realtime',
});
session.on((frame) => {
if (frame.type === 'audio') play(frame.pcm);
else if (frame.type === 'transcript') console.log(frame.role, frame.text);
});
session.sendAudio(pcm16Chunk);
// ... end of user turn
session.commit();
```
## `speko.realtime.connect(params)` [#spekorealtimeconnectparams]
### Signature [#signature]
```ts
speko.realtime.connect(
params: RealtimeConnectParams,
): Promise<RealtimeSessionHandle>
```
### `RealtimeConnectParams` [#realtimeconnectparams]
| Field | Type | Description |
| ------------------ | -------------------------------------------- | ---------------------------------------------------------------------------------- |
| `provider` | `'openai' \| 'google' \| 'xai' \| 'inworld'` | S2S provider. |
| `model` | `string` | Provider-specific model id (e.g. `gpt-realtime`, `gemini-2.5-flash-native-audio`). |
| `voice` | `string?` | Voice id override — interpreted per provider. |
| `systemPrompt` | `string?` | Initial system instruction. |
| `temperature` | `number?` | |
| `inputSampleRate` | `16000 \| 24000?` | PCM rate you'll be sending. |
| `outputSampleRate` | `16000 \| 24000?` | PCM rate you want back. |
| `tools` | `RealtimeToolSpec[]?` | Tool definitions the assistant may call. |
| `metadata` | `Record?` | Free-form metadata attached to the session record. |
| `ttlSeconds` | `number?` | Max session duration. Server-capped at 1800 (30 min). |
## `RealtimeSessionHandle` [#realtimesessionhandle]
| Property | Type | Description |
| ------------------ | ---------------- | -------------------------------- |
| `sessionId` | `string` | Server-assigned session id. |
| `expiresAt` | `string` | ISO-8601 expiry of the WS token. |
| `inputSampleRate` | `16000 \| 24000` | PCM rate the session accepts. |
| `outputSampleRate` | `16000 \| 24000` | PCM rate the session returns. |
### Methods [#methods]
| Method | Description |
| -------------------------------------- | ---------------------------------------------------------------- |
| `sendAudio(pcm: Uint8Array): void` | Ship a PCM16 audio chunk up to the model. |
| `commit(): void` | Signal end-of-user-turn; server flushes buffered audio upstream. |
| `interrupt(): void` | Cancel the assistant's in-flight response. |
| `sendToolResult(callId, output): void` | Return the result of a previously-issued `tool_call`. |
| `on(handler): () => void` | Subscribe to frames. Returns an unsubscribe callback. |
| `close(code?, reason?): void` | Close the socket. Idempotent. |
### `RealtimeFrame` variants [#realtimeframe-variants]
```ts
type RealtimeFrame =
| { type: 'ready'; inputSampleRate: 16000 | 24000; outputSampleRate: 16000 | 24000 }
| { type: 'audio'; pcm: Uint8Array; sampleRate: number }
| { type: 'transcript'; role: 'user' | 'assistant'; text: string; final: boolean }
| { type: 'tool_call'; callId: string; name: string; arguments: string }
| { type: 'usage'; inputAudioTokens: number; outputAudioTokens: number }
| { type: 'interruption'; at: 'user' | 'assistant' }
| { type: 'server_tool_call'; id: string; name: string; status: 'started' | 'completed' | 'failed' }
| { type: 'error'; code: string; message: string }
| { type: 'close'; code: number; reason: string };
```
## Example — tool calls [#example--tool-calls]
```ts
const session = await speko.realtime.connect({
provider: 'openai',
model: 'gpt-realtime',
tools: [
{
name: 'get_weather',
description: 'Current weather for a city.',
parameters: {
type: 'object',
properties: { city: { type: 'string' } },
required: ['city'],
},
},
],
});
session.on(async (frame) => {
if (frame.type === 'tool_call' && frame.name === 'get_weather') {
const { city } = JSON.parse(frame.arguments);
const result = await fetchWeather(city);
session.sendToolResult(frame.callId, result);
}
});
```
## Transport notes [#transport-notes]
* **Auth via subprotocol.** The WS token is passed as the first WebSocket subprotocol. Browsers can't set headers on `new WebSocket()`, so subprotocol is the only auth carrier that doesn't leak through URL params.
* **Binary type.** The SDK forces `binaryType = 'arraybuffer'`; inbound audio arrives as `Uint8Array` over an `ArrayBuffer`.
* **Missing PCM.** Until you call `sendAudio` the upstream provider sees no user input. Hook up a Web Audio `AudioWorklet` capture on the client side, or use `@spekoai/client` which handles capture for you over the browser media transport.
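If you capture audio yourself, Web Audio hands you `Float32Array` samples in the range -1 to 1, while `sendAudio` takes PCM16 bytes. A minimal conversion sketch follows; `floatToPcm16` is a hypothetical helper, and little-endian byte order is an assumption, since the byte order is not stated here.

```typescript
/** Convert Web Audio float samples (-1..1) to little-endian PCM16 bytes. */
function floatToPcm16(samples: Float32Array): Uint8Array {
  const out = new Uint8Array(samples.length * 2);
  const view = new DataView(out.buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot wrap around.
    const clamped = Math.max(-1, Math.min(1, samples[i] ?? 0));
    // Negative values scale to -32768, positive values to 32767.
    const value = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, Math.round(value), true); // true = little-endian (assumed)
  }
  return out;
}
```

The result can be passed to `session.sendAudio(...)`, provided the capture sample rate matches the session's `inputSampleRate`.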
# Speko client (/sdk/speko-client)
Construct the SDK client and configure transport options.
The `Speko` class is the single entry point. Construct it once per process and reuse it — HTTP keep-alive, auth headers, and timeout state are all bound to the instance.
```ts
import { Speko } from '@spekoai/sdk';
const speko = new Speko({
apiKey: process.env.SPEKO_API_KEY!,
baseUrl: 'https://api.speko.dev', // optional
timeout: 30_000, // optional, ms
});
```
## `new Speko(options)` [#new-spekooptions]
### `SpekoClientOptions` [#spekoclientoptions]
| Field | Type | Default | Description |
| --------- | --------- | ----------------------- | ----------------------------------------------------------------------------------------------------------- |
| `apiKey` | `string` | — (required) | API key. Get one at [platform.speko.dev/api-keys](https://platform.speko.dev/api-keys). Throws if missing. |
| `baseUrl` | `string?` | `https://api.speko.dev` | Override the proxy base URL — useful for self-hosted deploys or local development. |
| `timeout` | `number?` | `30000` | Per-request timeout in milliseconds. Applied via an internal `AbortController` composed with external ones. |
A trailing slash on `baseUrl` is stripped automatically.
### Instance properties [#instance-properties]
* `speko.usage` — a [`Usage`](/sdk/usage) resource for billing queries.
* `speko.credits` — a [`Credits`](/sdk/credits) resource for balance and ledger queries.
* `speko.realtime` — a [`Realtime`](/sdk/realtime) resource for opening speech-to-speech sessions.
* `speko.voice` — outbound phone-call helpers backed by [`/v1/sessions/phone`](/sdk/voice).
* `speko.phoneNumbers` — provision, list, update, release, and verify organization phone numbers.
* `speko.agents` — manage persisted voice personas and their registered tools.
* `speko.calls` — inspect calls, events, reports, recordings, and transfers.
* `speko.callbacks` — list, cancel, and dispatch scheduled callbacks.
* `speko.knowledgeBases` — manage per-agent knowledge bases and document uploads.
### Instance methods [#instance-methods]
* `speko.transcribe(audio, options, abortSignal?)` — see [transcribe](/sdk/transcribe).
* `speko.transcribeStream(audio, options, abortSignal?)` — transcript SSE events.
* `speko.synthesize(text, options, abortSignal?)` — see [synthesize](/sdk/synthesize).
* `speko.synthesizeStream(text, options, abortSignal?)` — chunked audio bytes.
* `speko.complete(params, abortSignal?)` — see [complete](/sdk/complete).
* `speko.completeStream(params, abortSignal?)` — completion SSE events.
All proxy methods accept a trailing `AbortSignal` and compose it with the client's timeout.
## Authentication [#authentication]
Requests send `Authorization: Bearer <apiKey>`. The SDK sets a package `User-Agent` on every request. Non-2xx responses are parsed as `{ error, code }` JSON when possible and re-thrown as [`SpekoApiError`](/sdk/errors) (or its `SpekoAuthError` / `SpekoRateLimitError` subclass).
## Concurrency [#concurrency]
The client is safe to share across concurrent requests — it holds no per-call mutable state. The internal `fetch` calls are fully independent; each builds its own `AbortController`.
# synthesize (/sdk/synthesize)
POST /v1/synthesize — text-to-speech with automatic provider routing.
Synthesize text into audio. The router picks the best TTS provider for your `(language, region, optimizeFor)` and fails over automatically.
```ts
const result = await speko.synthesize('Hello world', {
language: 'en',
});
```
## Signature [#signature]
```ts
speko.synthesize(
text: string,
options: SynthesizeOptions,
abortSignal?: AbortSignal,
): Promise<SynthesizeResult>
speko.synthesizeStream(
text: string,
options: SynthesizeOptions,
abortSignal?: AbortSignal,
): Promise<SynthesizeStreamResult>
```
## Parameters [#parameters]
### `text: string` [#text-string]
The text to synthesize. The server-side cap is **50,000 characters per call** (raised from 10,000 to handle long-form audiobook / podcast content). The upstream provider may still apply its own limit; if you need longer than 50K, chunk the script and call `synthesize` per chunk.
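One way to chunk a long script is to cut at the last sentence boundary that fits under the cap. The sketch below is a hypothetical helper (`chunkText` is not part of the SDK). It counts UTF-16 code units via `String.length`, which may differ from how the server counts characters, so leave some headroom.

```typescript
/** Split text into chunks of at most maxChars, preferring sentence boundaries. */
function chunkText(text: string, maxChars = 50_000): string[] {
  if (maxChars < 1) throw new RangeError('maxChars must be at least 1');
  const chunks: string[] = [];
  let rest = text.trim();
  while (rest.length > maxChars) {
    const head = rest.slice(0, maxChars);
    // Prefer the last sentence end, then the last space, then a hard cut.
    const sentenceEnd = Math.max(
      head.lastIndexOf('. '),
      head.lastIndexOf('! '),
      head.lastIndexOf('? '),
    );
    const lastSpace = head.lastIndexOf(' ');
    const cut = sentenceEnd > 0 ? sentenceEnd + 1 : lastSpace > 0 ? lastSpace : maxChars;
    chunks.push(rest.slice(0, cut).trim());
    rest = rest.slice(cut).trim();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

Each chunk can then be passed to `speko.synthesize` in order. If you concatenate the audio afterwards, pin a provider so every chunk comes back in the same format.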
### `options: SynthesizeOptions` [#options-synthesizeoptions]
Extends [`RoutingIntent`](/sdk/types#routingintent):
| Field | Type | Description |
| ------------- | ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `language` | `string` (BCP-47) | e.g. `"en"`, `"es-MX"`. |
| `region` | `string?` | Region to rank streaming providers in. Defaults to `global` server-side. |
| `optimizeFor` | `OptimizeFor?` | `balanced`, `accuracy`, `latency`, `cost`. |
| `voice` | `string?` | Voice id override. The router interprets it per provider (e.g. a Cartesia voice UUID). Browse the catalog with [`speko.voices.list()`](/sdk/voices). |
| `model` | `string?` | Upstream model name (e.g. `eleven_multilingual_v2`, `sonic-2`, `gpt-4o-mini-tts`, `qwen3-tts-flash`). Overrides the selector's choice on the primary candidate only — failover candidates use the selector's model so a model intended for provider A isn't sent to provider B. |
| `speed` | `number?` | Speech speed multiplier. Providers vary in what range they accept — `1.0` is always neutral. |
| `constraints` | `PipelineConstraints?` | Allow-list constraints. |
### `abortSignal?: AbortSignal` [#abortsignal-abortsignal]
Cancel an in-flight request.
## Returns [#returns]
### `SynthesizeResult` [#synthesizeresult]
| Field | Type | Description |
| --------------- | ---------------- | ------------------------------------------------------------------------------------ |
| `audio` | `Uint8Array` | Raw audio bytes. Format depends on the chosen provider — always check `contentType`. |
| `contentType` | `string` | MIME type. ElevenLabs returns `audio/mpeg`. Cartesia returns `audio/pcm;rate=24000`. |
| `provider` | `string` | Upstream provider that ran the request. |
| `model` | `string` | Provider-specific model identifier (e.g. voice model name). |
| `failoverCount` | `number` | Providers tried before this one succeeded. |
| `scoresRunId` | `string \| null` | Scoring run id that selected this provider. |
## Wire format [#wire-format]
The SDK sends `POST /v1/synthesize` with a JSON body:
```json
{
"text": "Hello world",
"intent": { "language": "en", "region": "global", "optimizeFor": "latency" },
"voice": "…",
"speed": 1.0,
"constraints": { "allowedProviders": { "tts": ["cartesia"] } }
}
```
The response is chunked binary audio. `provider`, `model`, `failoverCount`, and
`scoresRunId` are parsed from response headers (`X-Speko-Provider`,
`X-Speko-Model`, `X-Speko-Failover-Count`, `X-Speko-Scores-Run-Id`).
`speko.synthesize()` consumes the chunks into one `Uint8Array`; use
`speko.synthesizeStream()` to handle chunks as they arrive.
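Consuming chunks into one `Uint8Array` amounts to a length-prefixed copy. The helper below is a sketch of that step only; `concatChunks` is a hypothetical name, and how you obtain the chunks from `synthesizeStream()` depends on the shape of `SynthesizeStreamResult`, which is not shown here.

```typescript
/** Join audio chunks into one contiguous buffer, preserving order. */
function concatChunks(chunks: readonly Uint8Array[]): Uint8Array {
  // Size the output once so each chunk is copied exactly one time.
  const total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return out;
}
```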
## Example: write to disk [#example-write-to-disk]
```ts
import { writeFile } from 'node:fs/promises';
const result = await speko.synthesize('Welcome to the clinic.', {
language: 'en',
voice: 'sonic-english',
});
const ext = result.contentType.includes('mpeg')
? 'mp3'
: result.contentType.includes('pcm')
? 'pcm'
: 'bin';
await writeFile(`greeting.${ext}`, result.audio);
```
## Example: pin a provider for deterministic output [#example-pin-a-provider-for-deterministic-output]
```ts
await speko.synthesize('…', {
language: 'en',
constraints: { allowedProviders: { tts: ['cartesia'] } },
});
```
## Example: pin a specific model [#example-pin-a-specific-model]
Useful for benchmarking (e.g. `eleven_v3` vs `eleven_multilingual_v2`) or for
long-form runs where you want to lock in a particular model's stability profile:
```ts
await speko.synthesize('…', {
language: 'en',
constraints: { allowedProviders: { tts: ['elevenlabs'] } },
model: 'eleven_multilingual_v2',
});
```
## Format gotchas [#format-gotchas]
The audio format depends on the provider Speko picks. If your downstream consumer only handles PCM (e.g. [`@spekoai/adapter-livekit`](/adapter-livekit/speko-tts) v1), either pin a PCM provider via `constraints` or branch on `contentType` before you decode.
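Branching on `contentType` can be sketched as a small classifier. `classifyAudio` and the `AudioFormat` type are hypothetical, and only the two MIME types mentioned on this page (`audio/mpeg` and `audio/pcm;rate=...`) are recognized; other providers may return formats not covered here.

```typescript
type AudioFormat =
  | { kind: 'pcm'; sampleRate: number | null }
  | { kind: 'mp3' }
  | { kind: 'unknown'; contentType: string };

/** Classify a SynthesizeResult.contentType value. */
function classifyAudio(contentType: string): AudioFormat {
  const [mime = '', ...params] = contentType
    .toLowerCase()
    .split(';')
    .map((part) => part.trim());
  if (mime === 'audio/mpeg') return { kind: 'mp3' };
  if (mime === 'audio/pcm') {
    // Look for a `rate=<hz>` parameter, e.g. "audio/pcm;rate=24000".
    const rate = params.map((p) => p.split('=')).find(([key]) => key === 'rate')?.[1];
    const parsed = rate === undefined ? NaN : Number(rate);
    return { kind: 'pcm', sampleRate: Number.isFinite(parsed) && parsed > 0 ? parsed : null };
  }
  return { kind: 'unknown', contentType };
}
```

A PCM-only consumer can reject anything where `kind !== 'pcm'` instead of guessing at the byte layout.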
# transcribe (/sdk/transcribe)
POST /v1/transcribe — speech-to-text with automatic provider routing.
Transcribe an audio payload. The router picks the best STT provider for your `(language, region, optimizeFor)` and fails over automatically.
```ts
const { text, provider, confidence } = await speko.transcribe(audio, {
language: 'es-MX',
});
```
## Signature [#signature]
```ts
speko.transcribe(
audio: Uint8Array,
options: TranscribeOptions,
abortSignal?: AbortSignal,
): Promise<TranscribeResult>
speko.transcribeStream(
audio: Uint8Array,
options: TranscribeOptions,
abortSignal?: AbortSignal,
): AsyncIterable<TranscribeStreamEvent>
```
## Parameters [#parameters]
### `audio: Uint8Array` [#audio-uint8array]
Raw audio bytes. Default MIME is `audio/wav`; set `options.contentType` if you send something else. Providers handle resampling and format conversion downstream — you don't have to match a specific sample rate.
### `options: TranscribeOptions` [#options-transcribeoptions]
Extends [`RoutingIntent`](/sdk/types#routingintent):
| Field | Type | Description |
| ------------- | -------------------------------------------------- | ------------------------------------------------------------------------------------ |
| `language` | `string` (BCP-47) | e.g. `"en"`, `"es-MX"`, `"ja-JP"`. |
| `region` | `string?` | Region to rank streaming providers in. Defaults to `global` server-side. |
| `optimizeFor` | `'balanced' \| 'accuracy' \| 'latency' \| 'cost'?` | Bias the weighted score. Defaults to the server-side default (currently `balanced`). |
| `contentType` | `string?` | MIME type for the body. Defaults to `audio/wav`. |
| `constraints` | `PipelineConstraints?` | Allow-list constraints (see [Types](/sdk/types#pipelineconstraints)). |
| `keywords` | `readonly string[]?` | Domain words and proper nouns to bias STT output toward. |
### `abortSignal?: AbortSignal` [#abortsignal-abortsignal]
Cancel an in-flight request. Composed with the client-level timeout.
## Returns [#returns]
### `TranscribeResult` [#transcriberesult]
| Field | Type | Description |
| --------------- | ---------------- | ----------------------------------------------------------------------------------------- |
| `text` | `string` | Transcribed text. |
| `provider` | `string` | Upstream provider that ran the request (e.g. `deepgram`, `openai`). |
| `model` | `string` | Provider-specific model identifier. |
| `confidence` | `number \| null` | Model-reported confidence when available, else `null`. |
| `failoverCount` | `number` | How many providers were tried before this one succeeded. |
| `scoresRunId` | `string \| null` | ID of the scoring run that selected this provider — useful for joining to benchmark data. |
## Wire format [#wire-format]
The SDK sends the audio as the raw HTTP body with:
* `Content-Type`: from `options.contentType` (default `audio/wav`).
* `X-Speko-Intent`: JSON-serialized `{ language, region?, optimizeFor? }`.
* `X-Speko-Constraints`: JSON-serialized `options.constraints` (only if set).
* `X-Speko-Stt-Options`: JSON-serialized `{ keywords }` (only if keywords are set).
The wire response is `text/event-stream` with `meta`, `transcript`, `done`,
and `error` events. `speko.transcribe()` consumes that stream and returns the
final `TranscribeResult`; use `speko.transcribeStream()` to receive partial
transcripts directly.
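For callers hitting `/v1/transcribe` without the SDK, the header set listed above could be assembled roughly as follows. This is a sketch built only from the four headers named on this page: `buildTranscribeHeaders` is a hypothetical helper, and the `Authorization` header still has to be added separately.

```typescript
interface TranscribeHeaderInput {
  language: string;
  region?: string;
  optimizeFor?: 'balanced' | 'accuracy' | 'latency' | 'cost';
  contentType?: string;
  constraints?: { allowedProviders?: { stt?: string[]; llm?: string[]; tts?: string[] } };
  keywords?: readonly string[];
}

function buildTranscribeHeaders(options: TranscribeHeaderInput): Record<string, string> {
  const headers: Record<string, string> = {
    'Content-Type': options.contentType ?? 'audio/wav',
    // JSON.stringify drops undefined fields, so optional keys vanish when unset.
    'X-Speko-Intent': JSON.stringify({
      language: options.language,
      region: options.region,
      optimizeFor: options.optimizeFor,
    }),
  };
  if (options.constraints) {
    headers['X-Speko-Constraints'] = JSON.stringify(options.constraints);
  }
  if (options.keywords && options.keywords.length > 0) {
    headers['X-Speko-Stt-Options'] = JSON.stringify({ keywords: options.keywords });
  }
  return headers;
}
```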
## Example: non-default MIME [#example-non-default-mime]
```ts
import { readFile } from 'node:fs/promises';
const audio = await readFile('./call.ogg');
const result = await speko.transcribe(audio, {
language: 'en',
contentType: 'audio/ogg',
optimizeFor: 'accuracy',
});
```
## Example: restrict provider pool [#example-restrict-provider-pool]
```ts
const result = await speko.transcribe(audio, {
language: 'en',
constraints: {
allowedProviders: { stt: ['deepgram', 'assemblyai'] },
},
});
```
The router still ranks candidates by benchmark score and only picks from the allow-list.
## Example: bias proper nouns [#example-bias-proper-nouns]
```ts
const result = await speko.transcribe(audio, {
language: 'en',
keywords: ['Speko', 'Ava Martinez', 'Cartesia'],
});
```
# Types (/sdk/types)
Shared request / response types exported from @spekoai/sdk.
All types below are exported from the package root:
```ts
import type {
SpekoClientOptions,
KeySource,
OptimizeFor,
RoutingIntent,
PipelineConstraints,
TranscribeOptions,
TranscribeResult,
TranscribeStreamEvent,
SynthesizeOptions,
SynthesizeResult,
SynthesizeStreamResult,
ChatMessage,
ChatTool,
ChatToolCall,
ChatToolChoice,
CompleteParams,
CompleteResult,
CompleteStreamEvent,
UsageSummary,
UsageByProvider,
UsageQueryParams,
OrganizationBalance,
CreditLedgerEntry,
CreditLedgerKind,
CreditLedgerPage,
CreditLedgerQueryParams,
RealtimeProvider,
RealtimeToolSpec,
RealtimeConnectParams,
RealtimeFrame,
RealtimeEventHandler,
RealtimeSessionHandle,
VoiceDialParams,
VoiceDialResult,
PhoneNumberRow,
PhoneNumberKybOverview,
PhoneNumberKybSubmission,
CallDetail,
CallEvent,
CallReport,
CallTransfer,
ScheduledCallback,
AgentCallListPage,
AgentRow,
AgentCreateParams,
AgentToolRow,
KnowledgeBaseRow,
KnowledgeBaseDocumentRow,
} from '@spekoai/sdk';
```
## Routing primitives [#routing-primitives]
### `OptimizeFor` [#optimizefor]
```ts
type OptimizeFor = 'balanced' | 'accuracy' | 'latency' | 'cost';
```
Biases the weighted score the router uses to rank candidates. `balanced` is the default.
### `RoutingIntent` [#routingintent]
```ts
interface RoutingIntent {
language: string; // BCP-47, e.g. "en" or "es-MX"
region?: string; // e.g. "us-east4"; defaults to "global" server-side
optimizeFor?: OptimizeFor;
}
```
Every proxy call takes one. `TranscribeOptions` and `SynthesizeOptions` extend it directly; `CompleteParams` carries it as the `intent` field.
### `PipelineConstraints` [#pipelineconstraints]
```ts
interface PipelineConstraints {
allowedProviders?: {
stt?: string[];
llm?: string[];
tts?: string[];
};
}
```
Optional allow-list layered on top of `RoutingIntent`. When set, the router still ranks by benchmark score but considers only candidates in the allow-list for that modality.
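The interaction between scoring and the allow-list can be pictured as "filter, then rank". The sketch below is purely illustrative: the `Candidate` shape, the numeric scores, and `rankCandidates` are hypothetical and say nothing about how Speko computes scores server-side.

```typescript
interface Candidate {
  provider: string;
  score: number; // hypothetical benchmark score, higher is better
}

/** Rank candidates by score, restricted to the allow-list when one is given. */
function rankCandidates(
  candidates: readonly Candidate[],
  allowed?: readonly string[],
): Candidate[] {
  const pool = allowed
    ? candidates.filter((c) => allowed.includes(c.provider))
    : [...candidates];
  // The first entry is the primary; the rest form the failover order.
  return pool.sort((a, b) => b.score - a.score);
}
```

Note that the order of names in the allow-list does not matter in this model: only the scores decide which allowed provider comes first.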
## Transcribe [#transcribe]
```ts
interface TranscribeOptions extends RoutingIntent {
contentType?: string; // default "audio/wav"
constraints?: PipelineConstraints;
keywords?: readonly string[];
}
interface TranscribeResult {
text: string;
provider: string;
model: string;
confidence: number | null;
failoverCount: number;
scoresRunId: string | null;
}
```
## Synthesize [#synthesize]
```ts
interface SynthesizeOptions extends RoutingIntent {
voice?: string;
model?: string;
speed?: number;
constraints?: PipelineConstraints;
}
interface SynthesizeResult {
audio: Uint8Array;
contentType: string; // e.g. "audio/pcm;rate=24000" or "audio/mpeg"
provider: string;
model: string;
failoverCount: number;
scoresRunId: string | null;
}
```
## Complete [#complete]
```ts
interface ChatMessage {
role: 'system' | 'user' | 'assistant' | 'tool';
content: string;
toolCalls?: ChatToolCall[];
toolCallId?: string;
isError?: boolean;
}
interface ChatToolCall {
id: string;
name: string;
args: string;
}
type ChatToolChoice =
| 'auto'
| 'none'
| 'required'
| { type: 'function'; function: { name: string } };
interface ChatTool {
name: string;
description: string;
parameters: Record;
executionMode?: 'inline' | 'webhook' | 'builtin';
source?:
| { kind: 'inline' }
| {
kind: 'webhook';
url: string;
secretRef: string;
headers?: Record;
timeoutMs?: number;
}
| { kind: 'builtin'; name: string; config?: unknown };
}
interface CompleteParams {
messages: ChatMessage[];
intent: RoutingIntent;
systemPrompt?: string;
temperature?: number;
maxTokens?: number;
reasoningEffort?: 'none' | 'minimal' | 'low' | 'medium' | 'high' | 'xhigh';
constraints?: PipelineConstraints;
tools?: ChatTool[];
toolChoice?: ChatToolChoice;
parallelToolCalls?: boolean;
maxToolHops?: number;
}
interface CompleteResult {
text: string;
provider: string;
model: string;
usage: {
promptTokens: number;
completionTokens: number;
};
failoverCount: number;
scoresRunId: string | null;
toolCalls?: ChatToolCall[];
}
```
## Usage [#usage]
```ts
type KeySource = 'BYOK' | 'MANAGED';
interface UsageQueryParams {
from?: string; // ISO-8601
to?: string; // ISO-8601
}
interface UsageSummary {
totalSessions: number;
totalMinutes: number;
totalCost: number;
breakdown: UsageByProvider[];
balanceUsd: number;
currency: 'USD';
}
interface UsageByProvider {
provider: string;
type: 'stt' | 'llm' | 'tts';
metric: string;
keySource: KeySource;
quantity: number;
cost: number;
}
```
## Credits [#credits]
```ts
interface OrganizationBalance {
balanceUsd: number;
currency: 'USD';
updatedAt: string;
}
type CreditLedgerKind = 'grant' | 'debit' | 'topup' | 'refund' | 'adjustment';
interface CreditLedgerEntry {
id: string;
kind: CreditLedgerKind;
/** Signed. Positive for grants/topups/refunds, negative for debits. */
amountMicroUsd: string;
metric: string | null;
provider: string | null;
sessionId: string | null;
createdAt: string;
}
interface CreditLedgerPage {
entries: CreditLedgerEntry[];
nextCursor: string | null;
}
interface CreditLedgerQueryParams {
limit?: number;
cursor?: string;
}
```
## Realtime (S2S) [#realtime-s2s]
```ts
type RealtimeProvider = 'openai' | 'google' | 'xai' | 'inworld';
interface RealtimeToolSpec {
name: string;
description: string;
parameters: Record;
}
interface RealtimeConnectParams {
provider: RealtimeProvider;
model: string;
voice?: string;
systemPrompt?: string;
temperature?: number;
inputSampleRate?: 16000 | 24000;
outputSampleRate?: 16000 | 24000;
tools?: RealtimeToolSpec[];
metadata?: Record;
/** Max session duration in seconds. Server-capped at 1800 (30 min). */
ttlSeconds?: number;
}
type RealtimeFrame =
| { type: 'ready'; inputSampleRate: 16000 | 24000; outputSampleRate: 16000 | 24000 }
| { type: 'audio'; pcm: Uint8Array; sampleRate: number }
| {
type: 'transcript';
role: 'user' | 'assistant';
text: string;
final: boolean;
}
| {
type: 'tool_call';
callId: string;
name: string;
arguments: string;
}
| {
type: 'usage';
inputAudioTokens: number;
outputAudioTokens: number;
}
| { type: 'interruption'; at: 'user' | 'assistant' }
| {
type: 'server_tool_call';
id: string;
name: string;
status: 'started' | 'completed' | 'failed';
}
| { type: 'error'; code: string; message: string }
| { type: 'close'; code: number; reason: string };
type RealtimeEventHandler = (frame: RealtimeFrame) => void;
interface RealtimeSessionHandle {
readonly sessionId: string;
readonly expiresAt: string;
readonly inputSampleRate: 16000 | 24000;
readonly outputSampleRate: 16000 | 24000;
sendAudio(pcm: Uint8Array): void;
commit(): void;
interrupt(): void;
sendToolResult(callId: string, output: string): void;
on(handler: RealtimeEventHandler): () => void;
close(code?: number, reason?: string): void;
}
```
## Voice, Phone Numbers, Agents, and Knowledge Bases [#voice-phone-numbers-agents-and-knowledge-bases]
These resource types are also exported from the package root. The full shapes live in TypeScript and are intended to be consumed from your editor, but the high-level entry points are:
```ts
interface VoiceDialParams {
to: string;
from?: string;
intent: RoutingIntent;
constraints?: PipelineConstraints;
voice?: string;
systemPrompt?: string;
llm?: { temperature?: number; maxTokens?: number };
ttsOptions?: { sampleRate?: number; speed?: number };
metadata?: Record;
}
interface VoiceDialResult {
sessionId: string;
callControlId: string;
roomName: string;
status: 'dialing' | 'dialing-stub';
to: string;
from: string;
}
interface PhoneNumberRow {
id: string;
e164: string;
direction: 'inbound' | 'outbound' | 'both';
label: string | null;
agentId: string | null;
createdAt: string;
updatedAt: string;
}
interface AgentRow {
id: string;
name: string;
systemPrompt: string;
voice: string | null;
intent: { language: string; optimizeFor?: 'latency' | 'quality' | 'cost' };
llmOptions: { temperature?: number; maxTokens?: number; model?: string } | null;
stackPreferences: {
allowedProviders?: { stt?: string[]; llm?: string[]; tts?: string[]; s2s?: string[] };
} | null;
sttOptions: { keywords?: string[] } | null;
createdAt: string;
updatedAt: string;
}
interface KnowledgeBaseRow {
id: string;
agentId: string;
name: string;
description: string | null;
embeddingModel: string;
documentCount: number;
chunkCount: number;
createdAt: string;
updatedAt: string;
}
```
## Client [#client]
```ts
interface SpekoClientOptions {
apiKey: string;
baseUrl?: string; // default "https://api.speko.dev"
timeout?: number; // default 30000
}
```
# usage (/sdk/usage)
GET /v1/usage — billing and usage reporting.
Get a usage summary for a billing period. Exposed via `speko.usage`.
```ts
const usage = await speko.usage.get();
console.log(usage.totalSessions, usage.totalMinutes, usage.totalCost);
```
## `speko.usage.get(params?)` [#spekousagegetparams]
### Signature [#signature]
```ts
speko.usage.get(params?: UsageQueryParams): Promise<UsageSummary>
```
### `UsageQueryParams` [#usagequeryparams]
| Field | Type | Description |
| ------ | --------- | ------------------------------------------------------------------ |
| `from` | `string?` | ISO-8601 start date. Omit to default to the current billing start. |
| `to` | `string?` | ISO-8601 end date. Omit to default to now. |
### Returns — `UsageSummary` [#returns--usagesummary]
| Field | Type | Description |
| --------------- | ------------------- | ------------------------------------------ |
| `totalSessions` | `number` | Distinct sessions in range. |
| `totalMinutes` | `number` | Total audio minutes billed. |
| `totalCost` | `number` | Total cost in USD. |
| `breakdown` | `UsageByProvider[]` | Per-provider rollup (see below). |
| `balanceUsd` | `number` | Current organization balance in USD. |
| `currency` | `'USD'` | Currency for `balanceUsd` and `totalCost`. |
### `UsageByProvider` [#usagebyprovider]
| Field | Type | Description |
| ----------- | ------------------------- | -------------------------------------------------------------------------------- |
| `provider` | `string` | Upstream provider id. |
| `type` | `'stt' \| 'llm' \| 'tts'` | Modality. |
| `metric` | `string` | The billable metric (e.g. `minutes`, `characters`, `tokens`). |
| `keySource` | `'BYOK' \| 'MANAGED'` | `BYOK` = customer key, no Speko margin. `MANAGED` = platform key, billed to org. |
| `quantity` | `number` | Billed quantity in the metric's unit. |
| `cost` | `number` | Cost in USD. |
## Example: last 24 hours [#example-last-24-hours]
```ts
const now = new Date();
const yesterday = new Date(now.getTime() - 24 * 60 * 60 * 1000);
const usage = await speko.usage.get({
from: yesterday.toISOString(),
to: now.toISOString(),
});
for (const row of usage.breakdown) {
console.log(`${row.type}\t${row.provider}\t${row.quantity}${row.metric}\t$${row.cost}`);
}
```
# voice (/sdk/voice)
Outbound phone calls through POST /v1/sessions/phone.
Use `speko.voice.dial()` to place an outbound PSTN call. The API creates a voice session, dispatches the configured worker, and dials the destination over LiveKit SIP.
```ts
const call = await speko.voice.dial({
to: '+12015551234',
from: '+12015550199',
agentId: 'agent_123',
telephony: {
region: 'us-east',
amd: { mode: 'agent', timeoutSeconds: 8 },
},
});
console.log(call.sessionId, call.status);
```
## Signature [#signature]
```ts
speko.voice.dial(params: VoiceDialParams): Promise<VoiceDialResult>
```
## Parameters [#parameters]
| Field | Type | Description |
| -------------- | ---------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- |
| `to` | `string` | Destination in E.164 format. |
| `from` | `string?` | Caller ID. Must be registered to the organization unless it is the server default. |
| `agentId` | `string?` | Persisted agent to run. Required unless `intent` is supplied. |
| `intent` | `RoutingIntent?` | Routing intent for ad hoc calls. Required unless `agentId` is supplied. |
| `constraints` | `PipelineConstraints?` | Provider allow-list constraints. |
| `voice` | `string?` | TTS voice id override. |
| `systemPrompt` | `string?` | Agent instructions for this call. |
| `firstMessage` | `string?` | Optional first utterance. |
| `llm` | `{ temperature?: number; maxTokens?: number }?` | LLM tuning. |
| `ttsOptions` | `{ sampleRate?: number; speed?: number }?` | TTS options. |
| `sttOptions` | `{ keywords?: string[] }?` | STT keyword hints. |
| `telephony` | `{ region?: string; amd?: { mode?: 'agent' \| 'carrier' \| 'disabled'; timeoutSeconds?: number } }?` | SIP routing and voicemail-detection hints. |
| `metadata` | `Record?` | Free-form metadata surfaced in webhooks and call events. |
## Returns [#returns]
| Field | Type | Description |
| --------------- | ----------------------------- | ----------------------------------------------------------------------- |
| `sessionId` | `string` | Voice session id. |
| `callControlId` | `string` | LiveKit SIP participant identity for the outbound leg. |
| `roomName` | `string` | LiveKit room name. |
| `status` | `'dialing' \| 'dialing-stub'` | Dial status. Stub appears when SIP is not configured in the deployment. |
| `to` | `string` | Destination number. |
| `from` | `string` | Resolved caller ID. |
# voices (/sdk/voices)
GET /v1/voices — read-only catalog of TTS voices grouped by provider.
Browse the curated voice catalog before calling `synthesize`. Handy for picking
a voice id without reading every TTS provider's API docs.
```ts
const { voices, providers } = await speko.voices.list();
```
## Signature [#signature]
```ts
speko.voices.list(params?: VoicesListParams): Promise<VoicesListResult>
```
## Parameters [#parameters]
### `params.provider: string?` [#paramsprovider-string]
Filter to a single provider's voices. Accepts either the routing key
(`cartesia`, `xai`, `alibaba`, `openai`, `inworld`, `elevenlabs`) or the catalog
suffix form (`xai-tts`, `alibaba-tts`, `openai-tts`).
## Returns [#returns]
### `VoicesListResult` [#voiceslistresult]
| Field | Type | Description |
| ----------- | ----------------------- | ------------------------------------------------------ |
| `voices` | `VoiceCatalogEntry[]` | Curated voice roster across the included providers. |
| `providers` | `VoicesProviderEntry[]` | TTS providers Speko routes to, with their model lists. |
### `VoiceCatalogEntry` [#voicecatalogentry]
| Field | Type | Description |
| -------- | -------- | ------------------------------------------------------------ |
| `vendor` | `string` | Routing-key vendor (matches `allowedProviders.tts` entries). |
| `id` | `string` | Voice id forwarded verbatim to the provider's TTS API. |
| `name` | `string` | Human-readable label, e.g. `"Katie (American female)"`. |
### `VoicesProviderEntry` [#voicesproviderentry]
| Field | Type | Description |
| ------------------- | ---------- | --------------------------------------------------------------------------------------- |
| `key` | `string` | Catalog key, e.g. `cartesia`, `elevenlabs`, `alibaba-tts`. |
| `name` | `string` | Human-readable provider name. |
| `models` | `string[]` | Models available for the `model` field on `synthesize`. |
| `voicesFetchedLive` | `boolean` | `true` when the provider's voice library is account-scoped (currently only ElevenLabs). |
## ElevenLabs [#elevenlabs]
ElevenLabs voices are account-scoped, so they are **not** included in
`voices` — the corresponding `providers` entry sets `voicesFetchedLive: true`
as a hint to fetch them directly from
[`https://api.elevenlabs.io/v1/voices`](https://elevenlabs.io/docs/api-reference/voices/get-all)
with the org's key.
## Example: pick a voice and synthesize [#example-pick-a-voice-and-synthesize]
```ts
const { voices } = await speko.voices.list({ provider: 'cartesia' });
const sonia = voices.find((v) => v.name.startsWith('Sonia')) ?? voices[0]!;
await speko.synthesize('Hello world', {
language: 'en',
voice: sonia.id,
constraints: { allowedProviders: { tts: ['cartesia'] } },
});
```
# complete (/sdk-python/complete)
POST /v1/complete — LLM completion with automatic provider routing.
Run a single-shot LLM completion. The router picks the best LLM provider for your intent and fails over automatically.
```python
from spekoai import ChatMessage, RoutingIntent
reply = speko.complete(
messages=[ChatMessage(role="user", content="Hi!")],
intent=RoutingIntent(language="en"),
)
print(reply.text, reply.provider, reply.usage.prompt_tokens)
```
## Signature [#signature]
```python sync
Speko.complete(
*,
messages: list[ChatMessage | dict],
intent: RoutingIntent | dict,
system_prompt: str | None = None,
temperature: float | None = None,
max_tokens: int | None = None,
constraints: PipelineConstraints | dict | None = None,
) -> CompleteResult
```
```python async
await AsyncSpeko.complete(
*,
messages: list[ChatMessage | dict],
intent: RoutingIntent | dict,
system_prompt: str | None = None,
temperature: float | None = None,
max_tokens: int | None = None,
constraints: PipelineConstraints | dict | None = None,
) -> CompleteResult
```
## Parameters [#parameters]
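| Parameter | Description |
| --------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| `messages` | Conversation history. Roles: `system`, `user`, `assistant`. Dicts are validated against `ChatMessage` on the way in. |
| `intent` | Routing intent: `language`, optional `optimize_for`. |
| `system_prompt` | Shortcut for a leading system message. Providers with a native system channel use it directly; others fold it into the message list. |
| `temperature` | Forwarded to the provider. Omit to use the provider's default. |
| `max_tokens` | Max completion tokens. Omit to use the provider's default. |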
## Returns — `CompleteResult` [#returns--completeresult]
| Field | Description |
| ------ | ---------------- |
| `text` | Assistant reply. |
`/v1/complete` streams over the wire. The Python SDK consumes the stream and
returns the final `CompleteResult`; explicit Python streaming helpers are not
exposed yet.
## Example — multi-turn [#example--multi-turn]
```python
messages = [
{"role": "system", "content": "You are a concise voice assistant."},
{"role": "user", "content": "Book me an appointment for Tuesday."},
]
first = speko.complete(
messages=messages,
intent={"language": "en"},
temperature=0.3,
max_tokens=200,
)
messages.append({"role": "assistant", "content": first.text})
messages.append({"role": "user", "content": "3pm, with Dr. Chen."})
second = speko.complete(
messages=messages,
intent={"language": "en"},
)
```
## Example — pin a provider [#example--pin-a-provider]
```python
from spekoai import AllowedProviders, PipelineConstraints
reply = speko.complete(
messages=[{"role": "user", "content": "…"}],
intent={"language": "en"},
constraints=PipelineConstraints(
allowed_providers=AllowedProviders(llm=["anthropic"]),
),
)
```
# credits (/sdk-python/credits)
Prepaid credit balance and append-only ledger.
Query the organization's prepaid credit balance and walk the ledger of every credit movement (grants, debits, topups, refunds, adjustments). Exposed via `speko.credits`.
Balances are returned in USD. Ledger amounts still use micro-USD strings because ledger entries are signed accounting units.
## `get_balance` [#get_balance]
```python sync
Speko.credits.get_balance() -> OrganizationBalance
```
```python async
await AsyncSpeko.credits.get_balance() -> OrganizationBalance
```
Returns the current balance for the caller's organization.
```python
balance = speko.credits.get_balance()
if balance.balance_usd < 0.5:
    print("Top up before running long sessions.")
```
### `OrganizationBalance` [#organizationbalance]
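| Field | Description |
| ------------ | -------------------------------------------- |
| `updated_at` | ISO-8601 timestamp of the last ledger event. |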
## `get_ledger` [#get_ledger]
```python sync
Speko.credits.get_ledger(
*,
limit: int | None = None,
cursor: str | None = None,
) -> CreditLedgerPage
```
```python async
await AsyncSpeko.credits.get_ledger(
*,
limit: int | None = None,
cursor: str | None = None,
) -> CreditLedgerPage
```
Most-recent-first page of credit movements. Pass `next_cursor` from one response back in as `cursor` to fetch the next page; `next_cursor is None` means the history is exhausted.
```python
page = speko.credits.get_ledger(limit=50)
while True:
    for entry in page.entries:
        print(entry.created_at, entry.kind, entry.amount_micro_usd, entry.provider)
    if page.next_cursor is None:
        break
    page = speko.credits.get_ledger(limit=50, cursor=page.next_cursor)
```
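If you walk the ledger in several places, the cursor loop above can be wrapped in a generator. The sketch below is not part of the SDK: `iter_ledger` and its `fetch_page` callback are hypothetical names, and the two fake pages only stand in for real `CreditLedgerPage` responses so the block runs on its own.

```python
from types import SimpleNamespace
from typing import Callable, Iterator, Optional


def iter_ledger(fetch_page: Callable[[Optional[str]], object]) -> Iterator[object]:
    """Yield every entry, following next_cursor until it is None."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page.entries
        if page.next_cursor is None:
            return  # history exhausted
        cursor = page.next_cursor


# Fake pages keyed by the cursor that requests them (demo data only).
_fake_pages = {
    None: SimpleNamespace(entries=["e1", "e2"], next_cursor="c1"),
    "c1": SimpleNamespace(entries=["e3"], next_cursor=None),
}
all_entries = list(iter_ledger(_fake_pages.get))
```

Against the real client, the callback would presumably be `lambda cursor: speko.credits.get_ledger(limit=50, cursor=cursor)`.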
### `CreditLedgerEntry` [#creditledgerentry]
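| Field | Description |
| ------------------ | ------------------------------------------------------------------------------------------------------------------------ |
| `amount_micro_usd` | Signed: positive for grants/topups/refunds, negative for debits. String-encoded to survive JSON for values beyond 2^53. |
| `metric` | Billable metric when the entry ties to a specific usage row. |
| `session_id` | Session the debit was applied against. |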
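Because ledger amounts arrive as signed micro-USD strings, converting them through `float` can lose precision. A minimal sketch of an exact conversion, assuming one USD equals 1,000,000 micro-USD; the helper names `micro_usd_to_usd` and `net_usd` are illustrative and not SDK functions:

```python
from decimal import Decimal
from typing import Iterable

MICRO_PER_USD = Decimal(1_000_000)


def micro_usd_to_usd(amount_micro_usd: str) -> Decimal:
    """Convert a signed micro-USD string into an exact Decimal USD amount."""
    # Parsing the string directly avoids float rounding on large values.
    return Decimal(amount_micro_usd) / MICRO_PER_USD


def net_usd(amounts: Iterable[str]) -> Decimal:
    """Sum signed ledger amounts; debits are negative, so this nets them out."""
    return sum((micro_usd_to_usd(a) for a in amounts), Decimal(0))
```

For example, `net_usd(e.amount_micro_usd for e in page.entries)` would give the net movement on one page.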
### `CreditLedgerPage` [#creditledgerpage]
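| Field | Description |
| ------------- | -------------------------------------------------------------- |
| `next_cursor` | Cursor for the next page, or `None` when history is exhausted. |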
# Errors (/sdk-python/errors)
Typed exceptions raised by the Python SDK.
Every non-2xx API response is raised as a typed exception. Import from the package root:
```python
from spekoai import SpekoApiError, SpekoAuthError, SpekoRateLimitError
```
| Exception | When | Attributes |
| --------------------- | ----------------- | -------------------------------------------------------------------------- |
| `SpekoAuthError` | HTTP `401` | `message`, `status=401`, `code="AUTH_ERROR"` |
| `SpekoRateLimitError` | HTTP `429` | `message`, `status=429`, `code="RATE_LIMITED"`, `retry_after: int \| None` |
| `SpekoApiError` | any other non-2xx | `message`, `status`, `code` |
`message` and `code` are parsed from the JSON error body when present (`{"error": "...", "code": "..."}`); otherwise they fall back to `response.text` and `"UNKNOWN"`.
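The fallback rule can be sketched as a small pure function. This illustrates the described behavior and is not the SDK's actual implementation; `parse_error_body` is a hypothetical name, and falling back per field when only one key is missing is an assumption.

```python
import json
from typing import Tuple


def parse_error_body(body_text: str) -> Tuple[str, str]:
    """Return (message, code) from an error response body."""
    try:
        payload = json.loads(body_text)
    except ValueError:
        # Not JSON at all: use the raw text and the generic code.
        return body_text, "UNKNOWN"
    if not isinstance(payload, dict):
        return body_text, "UNKNOWN"
    message = payload.get("error")
    code = payload.get("code")
    return (
        message if isinstance(message, str) else body_text,
        code if isinstance(code, str) else "UNKNOWN",
    )
```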
## Example — targeted handling [#example--targeted-handling]
```python
import logging
import time
from spekoai import Speko, SpekoApiError, SpekoAuthError, SpekoRateLimitError
log = logging.getLogger(__name__)
speko = Speko(api_key="sk_live_...")
try:
    speko.complete(
        messages=[{"role": "user", "content": "Hi"}],
        intent={"language": "en"},
    )
except SpekoAuthError:
    # A bad or revoked key will not recover on retry, so re-raise.
    raise
except SpekoRateLimitError as err:
    # Wait for the server-suggested interval (or 1 s) before the caller retries.
    time.sleep(err.retry_after or 1)
except SpekoApiError as err:
    log.exception("speko call failed: %s (%s)", err.code, err.status)
```
`SpekoAuthError` and `SpekoRateLimitError` both inherit from `SpekoApiError`, so a bare `except SpekoApiError` catches all three. Always branch from most specific to most general.
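For rate limits specifically, a bounded retry loop that honors `retry_after` is a common pattern. The sketch below is generic rather than SDK code: `call_with_retry` is a hypothetical helper, and `DemoRateLimitError` is a stand-in for `SpekoRateLimitError` so the block runs without the package installed. The `sleep` parameter is injectable so the demo records delays instead of waiting.

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    rate_limit_exc: Type[Exception],
    max_attempts: int = 3,
    default_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, sleeping and retrying while rate_limit_exc is raised."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except rate_limit_exc as err:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Prefer the server hint; fall back when retry_after is None.
            sleep(getattr(err, "retry_after", None) or default_delay)
    raise AssertionError("unreachable")


class DemoRateLimitError(Exception):
    """Hypothetical stand-in exposing the same retry_after attribute."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


calls, delays = [], []


def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise DemoRateLimitError(retry_after=2)
    return "ok"


result = call_with_retry(flaky, DemoRateLimitError, sleep=delays.append)
```

With the real client this would presumably look like `call_with_retry(lambda: speko.complete(...), SpekoRateLimitError)`. Any other `SpekoApiError` still propagates on the first failure.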
## Realtime errors [#realtime-errors]
`connect_realtime` raises the same three exceptions from the initial `POST /v1/sessions` call. Once the WebSocket is open, transport failures surface as frames with `type == "error"`:
```python
async for frame in session:
    if frame["type"] == "error":
        log.error("realtime: %s — %s", frame["code"], frame["message"])
        break
```
# spekoai (Python) (/sdk-python/overview)
Official Python SDK for the Speko voice gateway — sync + async.
`spekoai` is the Python counterpart to `@spekoai/sdk`. The surface mirrors the TypeScript client one-to-one: `transcribe`, `synthesize`, `complete`, `usage`, `credits`, plus an async-only `connect_realtime` for speech-to-speech sessions.
Speko benchmarks every STT, LLM, and TTS provider per `(language, optimize_for)` and routes each call to the best one in real time. Failover is server-side — you ship one integration; providers rotate without a code change.
## Install [#install]
```bash
pip install spekoai
# or
uv add spekoai
```
Python 3.9+. Depends on `httpx`, `pydantic`, and `websockets`.
## Quickstart [#quickstart]
```python sync
import os
from pathlib import Path
from spekoai import Speko
speko = Speko(api_key=os.environ["SPEKO_API_KEY"])
audio = Path("call.wav").read_bytes()
result = speko.transcribe(
audio,
language="es-MX",
)
print(result.text, result.provider, result.confidence)
```
```python async
import asyncio
import os
from pathlib import Path
from spekoai import AsyncSpeko
async def main():
    async with AsyncSpeko(api_key=os.environ["SPEKO_API_KEY"]) as speko:
        audio = Path("call.wav").read_bytes()
        result = await speko.transcribe(
            audio,
            language="es-MX",
        )
        print(result.text, result.provider, result.confidence)
asyncio.run(main())
```
## What the router does [#what-the-router-does]
Every proxy call takes a `RoutingIntent` (`language`, optional `optimize_for`). Speko scores every provider for that intent against its continuously-updated benchmark set, picks the top-ranked one, and fails over to the next-best if the primary errors. The returned result carries `provider`, `model`, `failover_count`, and `scores_run_id` so you can log what actually ran.
To restrict the pool — for compliance, cost caps, or pinning a provider while debugging — pass `constraints=PipelineConstraints(allowed_providers=AllowedProviders(stt=[...], llm=[...], tts=[...]))`. The router still ranks by score but only considers candidates on the allow-list.
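As a mental model of "rank by score, restricted to the allow-list", the toy sketch below filters candidates and sorts them so the first entry is the primary and the rest form the failover order. It is not Speko's scoring code: `rank_candidates`, the provider names, and the scores are all made up for illustration.

```python
from typing import Dict, Iterable, List, Optional


def rank_candidates(
    scores: Dict[str, float],
    allowed: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return provider ids, best score first, limited to the allow-list."""
    allow = None if allowed is None else set(allowed)
    pool = {p: s for p, s in scores.items() if allow is None or p in allow}
    return sorted(pool, key=lambda p: pool[p], reverse=True)


# Hypothetical benchmark scores for one (language, optimize_for) intent.
demo_scores = {"provider_a": 0.91, "provider_b": 0.88, "provider_c": 0.95}
```

Note that an allow-list can exclude the overall top scorer: here `provider_c` ranks first unconstrained but drops out once the list is `["provider_a", "provider_b"]`.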
## Pydantic models, camelCase wire [#pydantic-models-camelcase-wire]
All request/response models use camelCase aliases over the wire and snake\_case attributes in Python. You can pass plain `dict`s wherever a model is accepted — the SDK validates them.
```python
from spekoai import ChatMessage, RoutingIntent
# Typed form
speko.complete(
messages=[ChatMessage(role="user", content="Hi")],
intent=RoutingIntent(language="en"),
)
# Dict form — validated on the way in
speko.complete(
messages=[{"role": "user", "content": "Hi"}],
intent={"language": "en"},
)
```
## Reference [#reference]
`POST /v1/transcribe` — speech to text.
`POST /v1/synthesize` — text to speech.
`POST /v1/complete` — LLM completion.
Speech-to-speech WebSocket sessions (async only).
`GET /v1/usage` — billing summary.
Prepaid balance and append-only ledger.
`SpekoApiError`, `SpekoAuthError`, `SpekoRateLimitError`.
## Client options [#client-options]
| Option | Default | Description |
| ---------- | ----------------------- | --------------------------------------------------------------------------------------------- |
| `api_key` | — (required) | Your API key. Mint one at [platform.speko.dev/api-keys](https://platform.speko.dev/api-keys). |
| `base_url` | `https://api.speko.dev` | Override for local development or self-hosted deployments. Trailing slash is stripped. |
| `timeout` | `30.0` | Per-request timeout in seconds. |
Both clients are safe to share across concurrent calls. `Speko` is a sync context manager (`__enter__` / `__exit__` / `close()`); `AsyncSpeko` is an async context manager (`__aenter__` / `__aexit__` / `await close()`).
# connect_realtime (/sdk-python/realtime)
Speech-to-speech WebSocket sessions, async only.
Open a speech-to-speech (S2S) session. The server mints a short-lived WebSocket token, then proxies the client WS directly to the underlying provider (OpenAI Realtime, Gemini Live, xAI Grok Voice). The browser media transport is skipped entirely so time-to-first-audio stays under \~300 ms.
Realtime sessions are only available on `AsyncSpeko`. A synchronous WebSocket loop would block the event loop on every audio chunk, defeating the purpose of a low-latency S2S pipeline.
```python
import asyncio
import os
from spekoai import AsyncSpeko, RealtimeConnectParams
async def main():
    async with AsyncSpeko(api_key=os.environ["SPEKO_API_KEY"]) as speko:
        session = await speko.connect_realtime(
            RealtimeConnectParams(provider="openai", model="gpt-realtime"),
        )
        async with session:
            # pcm_chunk and play() are placeholders for your audio source and sink.
            await session.send_audio(pcm_chunk)
            async for frame in session:
                if frame["type"] == "audio":
                    play(frame["pcm"])
                elif frame["type"] == "transcript":
                    print(frame["text"])
asyncio.run(main())
```
## Signature [#signature]
```python
await AsyncSpeko.connect_realtime(
params: RealtimeConnectParams,
) -> AsyncRealtimeSession
```
## `RealtimeConnectParams` [#realtimeconnectparams]
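| Field | Description |
| ------------- | -------------------------------------------------------------------------------------------------------- |
| `model` | Provider-specific model id (e.g. `gpt-realtime`, `gemini-2.5-flash-native-audio`, `grok-voice-beta`). |
| `voice` | Voice id override, interpreted per provider. |
| `tools` | Tool definitions the assistant may call. Receive `tool_call` frames and respond with `send_tool_result`. |
| `metadata` | Free-form metadata attached to the session record. |
| `ttl_seconds` | Max session duration in seconds. Server-capped at 1800 (30 min). |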
## `AsyncRealtimeSession` [#asyncrealtimesession]
The returned session is both an async context manager and an async iterator.
### Properties [#properties]
| Property | Type | Description |
| ------------ | ----- | ----------------------------------- |
| `session_id` | `str` | Server-assigned session identifier. |
| `expires_at` | `str` | ISO-8601 expiry for the WS token. |
### Methods [#methods]
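| Method | Description |
| ------------------ | --------------------------------------------------------------------------------------------- |
| `send_audio` | Ship a PCM16 chunk up to the model (binary WS frame). |
| `commit` | Signal end-of-user-turn; the server flushes buffered audio upstream. |
| `interrupt` | Cancel the assistant's current response mid-generation. |
| `send_tool_result` | Return the result of a previously-issued tool call. |
| `close` | Close the socket. Safe to call multiple times; the context manager calls it for you on exit. |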
### Frame types [#frame-types]
Iterating the session yields dicts tagged by `type`:
| Frame `type` | Payload fields |
| ------------ | --------------------------------------------------------- |
| `audio` | `pcm: bytes`, `sample_rate: 24000` |
| `transcript` | `role: 'user' \| 'assistant'`, `text: str`, `final: bool` |
| `tool_call` | `call_id: str`, `name: str`, `arguments: str` (JSON) |
| `usage` | `input_audio_tokens: int`, `output_audio_tokens: int` |
| `error` | `code: str`, `message: str` |
| `close` | `reason: str` |
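Since every frame is a dict tagged by `type`, handling usually reduces to a branch per tag. The sketch below collects finalized transcript turns from an ordinary iterable of frames so it can run without a live session; `final_transcripts` and the demo frames are illustrative, and with a real session the same branches would sit inside `async for frame in session`.

```python
from typing import Dict, Iterable, List, Tuple


def final_transcripts(frames: Iterable[Dict]) -> List[Tuple[str, str]]:
    """Collect (role, text) for final transcript frames until close."""
    turns = []
    for frame in frames:
        kind = frame["type"]
        if kind == "transcript" and frame["final"]:
            turns.append((frame["role"], frame["text"]))
        elif kind == "error":
            raise RuntimeError(f'{frame["code"]}: {frame["message"]}')
        elif kind == "close":
            break  # nothing after close is processed
        # audio, tool_call, and usage frames are ignored in this sketch
    return turns


# Made-up frames shaped like the table above.
demo_frames = [
    {"type": "transcript", "role": "user", "text": "hi", "final": False},
    {"type": "transcript", "role": "user", "text": "hi there", "final": True},
    {"type": "audio", "pcm": b"\x00\x00", "sample_rate": 24000},
    {"type": "transcript", "role": "assistant", "text": "hello", "final": True},
    {"type": "close", "reason": "done"},
    {"type": "transcript", "role": "assistant", "text": "late", "final": True},
]
```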
## Example — tool calling [#example--tool-calling]
```python
import asyncio
import json
import os
from spekoai import AsyncSpeko, RealtimeConnectParams, RealtimeToolSpec
async def main():
    async with AsyncSpeko(api_key=os.environ["SPEKO_API_KEY"]) as speko:
        session = await speko.connect_realtime(
            RealtimeConnectParams(
                provider="openai",
                model="gpt-realtime",
                tools=[
                    RealtimeToolSpec(
                        name="get_weather",
                        description="Current weather for a city.",
                        parameters={
                            "type": "object",
                            "properties": {"city": {"type": "string"}},
                            "required": ["city"],
                        },
                    ),
                ],
            ),
        )
        async with session:
            async for frame in session:
                if frame["type"] == "tool_call" and frame["name"] == "get_weather":
                    args = json.loads(frame["arguments"])
                    # fetch_weather is a placeholder for your own lookup.
                    result = fetch_weather(args["city"])
                    await session.send_tool_result(frame["call_id"], result)
asyncio.run(main())
```
# synthesize (/sdk-python/synthesize)
POST /v1/synthesize — text-to-speech with automatic provider routing.
Synthesize text into audio. The router picks the best TTS provider for your `(language, optimize_for)` and fails over automatically.
```python
from pathlib import Path
speech = speko.synthesize(
"Hello world",
language="en",
)
Path("out.mp3").write_bytes(speech.audio)
```
## Signature [#signature]
```python sync
Speko.synthesize(
text: str,
*,
language: str,
optimize_for: OptimizeFor | None = None,
voice: str | None = None,
speed: float | None = None,
constraints: PipelineConstraints | dict | None = None,
) -> SynthesizeResult
```
```python async
await AsyncSpeko.synthesize(
text: str,
*,
language: str,
optimize_for: OptimizeFor | None = None,
voice: str | None = None,
speed: float | None = None,
constraints: PipelineConstraints | dict | None = None,
) -> SynthesizeResult
```
## Parameters [#parameters]
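| Parameter | Description |
| ------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| `text` | The text to synthesize. No client-side length limit; the upstream provider applies its own. |
| `language` | BCP-47 language tag. |
| `voice` | Voice id override. Interpreted per provider (e.g. a Cartesia voice UUID, an ElevenLabs voice id). Omit to use each provider's default. |
| `speed` | Speech speed multiplier. Providers vary in accepted ranges; `1.0` is always neutral. |
| `constraints` | Allow-list constraints. |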
## Returns — `SynthesizeResult` [#returns--synthesizeresult]
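| Field | Description |
| ---------------- | ------------------------------------------------------------------------------------------ |
| `audio` | Raw audio bytes. Format depends on the chosen provider, so always branch on `content_type`. |
| `content_type` | MIME type. ElevenLabs returns `audio/mpeg`. Cartesia returns `audio/pcm;rate=24000`. |
| `failover_count` | Providers tried before this one succeeded. |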
## Example — write to disk [#example--write-to-disk]
```python
from pathlib import Path
speech = speko.synthesize(
"Welcome to the clinic.",
language="en",
voice="sonic-english",
)
ext = "mp3" if "mpeg" in speech.content_type else "pcm"
Path(f"greeting.{ext}").write_bytes(speech.audio)
```
## Example — pin a PCM provider [#example--pin-a-pcm-provider]
```python
from spekoai import AllowedProviders, PipelineConstraints
speech = speko.synthesize(
"Hello",
language="en",
constraints=PipelineConstraints(
allowed_providers=AllowedProviders(tts=["cartesia"]),
),
)
# speech.content_type == "audio/pcm;rate=24000"
```
Downstream consumers that only handle PCM (e.g. older LiveKit pipelines) should pin a PCM provider via `constraints` — or branch on `content_type` before decoding. MP3 from ElevenLabs will otherwise hit your decoder unexpectedly.
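Branching on `content_type` can be sketched with the standard library alone. The helpers below parse the `rate` parameter from a PCM MIME type and wrap raw samples in a WAV container. Both names are hypothetical, and the sketch assumes the PCM payload is 16-bit mono, which the synthesize response does not state explicitly.

```python
import io
import wave
from typing import Optional


def pcm_sample_rate(content_type: str) -> Optional[int]:
    """Return the rate from e.g. 'audio/pcm;rate=24000', or None if not PCM."""
    media_type, _, params = content_type.partition(";")
    if media_type.strip().lower() != "audio/pcm":
        return None
    for param in params.split(";"):
        key, _, value = param.partition("=")
        if key.strip().lower() == "rate" and value.strip().isdigit():
            return int(value.strip())
    return None


def pcm_to_wav(pcm: bytes, sample_rate: int) -> bytes:
    """Wrap raw PCM in a WAV container (assumes 16-bit mono samples)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # assumption: mono
        wav.setsampwidth(2)   # assumption: 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

A caller could then write `speech.audio` straight to disk when `pcm_sample_rate(speech.content_type)` is `None`, and otherwise save `pcm_to_wav(speech.audio, rate)` as a `.wav` file.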
# transcribe (/sdk-python/transcribe)
POST /v1/transcribe — speech-to-text with automatic provider routing.
Transcribe an audio payload. The router picks the best STT provider for your `(language, optimize_for)` and fails over automatically.
```python
result = speko.transcribe(
audio_bytes,
language="es-MX",
)
print(result.text, result.provider, result.confidence)
```
## Signature [#signature]
```python sync
Speko.transcribe(
audio: bytes,
*,
language: str,
optimize_for: OptimizeFor | None = None,
content_type: str = "audio/wav",
constraints: PipelineConstraints | dict | None = None,
) -> TranscribeResult
```
```python async
await AsyncSpeko.transcribe(
audio: bytes,
*,
language: str,
optimize_for: OptimizeFor | None = None,
content_type: str = "audio/wav",
constraints: PipelineConstraints | dict | None = None,
) -> TranscribeResult
```
## Parameters [#parameters]
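| Parameter | Description |
| -------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `audio` | Raw audio bytes. Providers handle resampling and format conversion, so any sample rate works. Wrap `bytearray` / `memoryview` with `bytes(...)` at the call site. |
| `language` | BCP-47 language tag, e.g. `"en"`, `"es-MX"`, `"ja-JP"`. |
| `optimize_for` | Preset that biases the weighted score. Server default is `balanced`. |
| `content_type` | MIME type for the request body. |
| `constraints` | Allow-list constraints. The router still ranks by score but only considers listed providers. |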
## Returns — `TranscribeResult` [#returns--transcriberesult]
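| Field | Description |
| ---------------- | ------------------------------------------------------------------------------------------ |
| `text` | Transcribed text. |
| `provider` | Upstream provider that ran the request. |
| `model` | Provider-specific model identifier. |
| `confidence` | Model-reported confidence when available. |
| `failover_count` | Number of providers tried before this one succeeded. |
| `scores_run_id` | ID of the scoring run that selected this provider; useful for joining to benchmark data. |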
## Example — non-default MIME + allow-list [#example--non-default-mime--allow-list]
```python
from pathlib import Path
from spekoai import AllowedProviders, PipelineConstraints, Speko
speko = Speko(api_key="sk_live_...")
audio = Path("call.ogg").read_bytes()
result = speko.transcribe(
audio,
language="en",
optimize_for="accuracy",
content_type="audio/ogg",
constraints=PipelineConstraints(
allowed_providers=AllowedProviders(stt=["deepgram", "assemblyai"]),
),
)
```
## Wire format [#wire-format]
The audio ships as the raw HTTP body. The routing intent and constraints travel in two headers so no server-side re-parsing of the body is needed:
* `Content-Type`: value of `content_type` (default `audio/wav`).
* `X-Speko-Intent`: compact JSON `{"language", "optimizeFor"?}`.
* `X-Speko-Constraints`: compact JSON when `constraints` is set.
The response is a JSON `TranscribeResult`.
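To make the header layout concrete, here is a sketch that builds the routing headers for a raw HTTP call. `build_transcribe_headers` is a hypothetical helper, the `Authorization` header is omitted, and the constraints JSON shape (`allowedProviders.stt`) is an assumption based on the camelCase wire convention rather than a documented payload.

```python
import json
from typing import Dict, List, Optional


def build_transcribe_headers(
    language: str,
    optimize_for: Optional[str] = None,
    content_type: str = "audio/wav",
    allowed_stt: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Build Content-Type plus the compact-JSON routing headers."""
    intent = {"language": language}
    if optimize_for is not None:
        intent["optimizeFor"] = optimize_for  # camelCase on the wire
    headers = {
        "Content-Type": content_type,
        # separators=(",", ":") drops whitespace for a compact header value.
        "X-Speko-Intent": json.dumps(intent, separators=(",", ":")),
    }
    if allowed_stt is not None:
        constraints = {"allowedProviders": {"stt": list(allowed_stt)}}  # assumed shape
        headers["X-Speko-Constraints"] = json.dumps(constraints, separators=(",", ":"))
    return headers
```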
# usage (/sdk-python/usage)
GET /v1/usage — billing and usage reporting.
Usage summary for the current billing period. Exposed via `speko.usage`.
```python
summary = speko.usage.get()
print(summary.total_sessions, summary.total_minutes, summary.total_cost)
print(summary.balance_usd)
```
## Signature [#signature]
```python sync
Speko.usage.get(
    *,
    from_date: str | None = None,
    to_date: str | None = None,
) -> UsageSummary
```
```python async
await AsyncSpeko.usage.get(
    *,
    from_date: str | None = None,
    to_date: str | None = None,
) -> UsageSummary
```
Both parameters accept ISO-8601 dates or datetimes. Omit either to default to the current billing period.
## Returns — `UsageSummary` [#returns--usagesummary]
* `total_cost`: Total cost in USD over the range.
* `breakdown`: Per-provider rollup; see `UsageByProvider` below.
* `balance_usd`: Current organization balance in USD.
### `UsageByProvider` [#usagebyprovider]
Modality.
Billable unit (e.g. `minutes`, `characters`, `tokens`).
`BYOK` = customer key, no Speko margin. `MANAGED` = platform key, billed to the org.
Cost in USD for this row.
## Example — last 24 hours [#example--last-24-hours]
```python
from datetime import datetime, timedelta, timezone

now = datetime.now(timezone.utc)
yesterday = now - timedelta(hours=24)

summary = speko.usage.get(
    from_date=yesterday.isoformat(),
    to_date=now.isoformat(),
)

# One line per breakdown row: modality, provider, quantity with unit, cost.
for row in summary.breakdown:
    print(f"{row.type}\t{row.provider}\t{row.quantity} {row.metric}\t${row.cost}")
```
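If you need totals per provider rather than per row, the breakdown can be folded client-side. The sketch below uses a stand-in `Row` dataclass with the field names from the example above; the sample values are made up, and treating `cost` as a plain float is an assumption about the SDK's types.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Row:
    # Stand-in for one breakdown row; not the SDK's own class.
    type: str
    provider: str
    quantity: float
    metric: str
    cost: float


def cost_by_provider(rows):
    """Sum the cost of every row, grouped by provider name."""
    totals = defaultdict(float)
    for row in rows:
        totals[row.provider] += row.cost
    return dict(totals)


# Illustrative sample data only.
rows = [
    Row("stt", "deepgram", 12.0, "minutes", 0.5),
    Row("tts", "elevenlabs", 4000, "characters", 0.25),
    Row("stt", "deepgram", 3.0, "minutes", 0.125),
]
totals = cost_by_provider(rows)
print(totals)  # {'deepgram': 0.625, 'elevenlabs': 0.25}
```

With a real response you would pass `summary.breakdown` in place of `rows`.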
# List agent calls (/api-reference/agent-calls)
GET /v1/agents/{id}/calls — list recent calls for an agent.
## GET /v1/agents/{id}/calls
List calls for an agent
Returns recent calls whose session pipeline config references this agent. Results are newest first and support cursor pagination.
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the token is an API key (`sk_live_...`).
### Parameters
- `id` (string, in path, required): Agent id.
- `limit` (integer, in query, optional): Default: `50`. Min: 1. Max: 100.
- `cursor` (string (date-time), in query, optional): Use `next_cursor` from the previous page.
- `since` (string (date-time), in query, optional): Only include calls created at or after this timestamp.
### Responses
#### 200: Page of calls
Content type: `application/json`
- `calls` (array of object, required)
- `id` (string, required)
- `call_id` (string, required)
- `resource_uri` (string, required)
- `agent_id` (string, required)
- `status` (string, required)
- `kind` (string, required)
- `room_name` (string | null, optional)
- `language` (string, optional)
- `created_at` (string (date-time), required)
- `ended_at` (string (date-time) | null, optional)
- `duration_seconds` (integer | null, optional)
- `recording_status` (string | null, optional)
- `entries` (array of object, required)
- `id` (string, required)
- `call_id` (string, required)
- `resource_uri` (string, required)
- `agent_id` (string, required)
- `status` (string, required)
- `kind` (string, required)
- `room_name` (string | null, optional)
- `language` (string, optional)
- `created_at` (string (date-time), required)
- `ended_at` (string (date-time) | null, optional)
- `duration_seconds` (integer | null, optional)
- `recording_status` (string | null, optional)
- `next_cursor` (string (date-time) | null, required)
Example:
```json
{
"calls": [
{
"id": "string",
"call_id": "string",
"resource_uri": "string",
"agent_id": "string",
"status": "string",
"kind": "string",
"room_name": "string",
"language": "string",
"created_at": "2026-01-01T00:00:00Z",
"ended_at": "2026-01-01T00:00:00Z",
"duration_seconds": 0,
"recording_status": "string"
}
],
"entries": [
{
"id": "string",
"call_id": "string",
"resource_uri": "string",
"agent_id": "string",
"status": "string",
"kind": "string",
"room_name": "string",
"language": "string",
"created_at": "2026-01-01T00:00:00Z",
"ended_at": "2026-01-01T00:00:00Z",
"duration_seconds": 0,
"recording_status": "string"
}
],
"next_cursor": "2026-01-01T00:00:00Z"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
#### 404: No agent with this id exists in the authenticated organization.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "not found",
"code": "NOT_FOUND"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/agents/{id}/calls' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
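Cursor pagination for this endpoint can be sketched as a generator that keeps requesting pages until `next_cursor` is `null`. The function names are hypothetical, and the page fetcher is passed in so the loop can be exercised without network access; the URL, query parameters, and response keys follow the reference above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.speko.dev"


def fetch_page(api_key, agent_id, cursor=None, limit=50):
    """GET one page of calls for an agent. Performs a real HTTP request."""
    query = {"limit": limit}
    if cursor is not None:
        query["cursor"] = cursor  # next_cursor from the previous page
    path = "/v1/agents/" + urllib.parse.quote(agent_id, safe="") + "/calls"
    url = BASE_URL + path + "?" + urllib.parse.urlencode(query)
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_calls(get_page):
    """Yield every call, newest first, following next_cursor until it is None."""
    cursor = None
    while True:
        page = get_page(cursor)
        yield from page["calls"]
        cursor = page["next_cursor"]
        if cursor is None:
            break
```

A caller would wire the two together with something like `iter_calls(lambda cursor: fetch_page(api_key, agent_id, cursor))`.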
# Create agent (/api-reference/agents-create)
Persists a reusable voice persona — `systemPrompt`, optional `voice`, routing `intent`, and optional `llmOptions` — keyed by `name` within the authenticated organization. Pass the returned `id` as `agentId` on `POST /v1/sessions` to seed a session from this agent.
## POST /v1/agents
Create an agent
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the token is an API key (`sk_live_...`).
### Request body
Required.
Content types: `application/json`
- `name` (string, required): Human-readable name. Must be unique within the organization — duplicates return 409.
- `systemPrompt` (string, required): Initial agent instructions used as the session-level `systemPrompt` default.
- `voice` (string, optional): Default TTS voice id. Omit to let the router pick per provider.
- `intent` (object, required): Routing defaults stored on an agent row. Hydrated into a session's `RoutingIntent` when the session is created with `agentId` and no inline `intent`. Note: agents accept a smaller `optimizeFor` enum than the per-session `RoutingIntent`.
- `language` (string, required): BCP-47 language tag, e.g. `en`, `en-US`, `es-MX`.
- `optimizeFor` (string, optional): One of: `"latency"`, `"quality"`, `"cost"`.
- `llmOptions` (object, optional): Optional LLM tuning defaults stored on an agent. Strict object — only these three keys are accepted.
- `temperature` (number, optional): Min: 0. Max: 2.
- `maxTokens` (integer, optional): Min: 1.
- `model` (string, optional): Pin a specific LLM model id. When omitted, the router picks per intent.
- `stackPreferences` (object, optional): Per-agent stack preferences. Empty / missing layers leave the router unconstrained for that layer; failover stays active within the allowed set.
- `allowedProviders` (object, optional): Per-pipeline-layer provider allowlists (e.g. STT must be Deepgram or AssemblyAI). Each entry is either `"{vendor}"` (vendor wildcard: allow any model from that vendor) or `"{vendor}:{model}"` (allow only that specific model). Failover stays active across all entries in the layer.
- `stt` (array of string, optional): Allowed STT provider entries. Empty / absent = no constraint.
- Allowed STT provider entry. Either a vendor id (e.g. `"deepgram"`, meaning any Deepgram model) or `"{vendor}:{model}"` (e.g. `"deepgram:nova-3"`, meaning only Nova-3 with no fallback to other Deepgram models within the vendor). Both forms can be mixed in the same array.
- `llm` (array of string, optional): Allowed LLM provider entries. Empty / absent = no constraint.
- Allowed LLM provider entry. Either a vendor id (e.g. `"openai"`, meaning any OpenAI model) or `"{vendor}:{model}"` (e.g. `"openai:gpt-5"`, meaning only that model). Both forms can be mixed in the same array.
- `tts` (array of string, optional): Allowed TTS provider entries. Empty / absent = no constraint.
- Allowed TTS provider entry. Either a vendor id (e.g. `"elevenlabs"`, meaning any ElevenLabs model) or `"{vendor}:{model}"` (e.g. `"elevenlabs:eleven_flash_v2_5"`, meaning only that model). Both forms can be mixed in the same array.
Example body:
```json
{
"name": "string",
"systemPrompt": "string",
"voice": "string",
"intent": {
"language": "en-US",
"optimizeFor": "latency"
},
"llmOptions": {
"temperature": 0,
"maxTokens": 256,
"model": "string"
},
"stackPreferences": {
"allowedProviders": {
"stt": [
"deepgram"
],
"llm": [
"openai"
],
"tts": [
"elevenlabs"
]
}
}
}
```
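The allow-list entry semantics described for `allowedProviders` can be modelled in a few lines. This is a client-side illustration of the matching rule only, not the router's implementation; `entry_allows` and `layer_allows` are hypothetical names.

```python
def entry_allows(entry, vendor, model):
    """Return True if one allow-list entry permits the given vendor and model."""
    allowed_vendor, sep, allowed_model = entry.partition(":")
    if allowed_vendor != vendor:
        return False
    # No ":" in the entry means a vendor wildcard: any model from that vendor.
    return not sep or allowed_model == model


def layer_allows(entries, vendor, model):
    """Apply a whole layer list (stt, llm, or tts) to a candidate provider."""
    if not entries:
        return True  # empty or absent list means no constraint for the layer
    return any(entry_allows(entry, vendor, model) for entry in entries)


print(layer_allows(["deepgram:nova-3", "assemblyai"], "deepgram", "nova-3"))  # True
print(layer_allows(["deepgram:nova-3"], "openai", "gpt-5"))  # False
```

Mixing both forms in one list works because each entry is checked independently.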
### Responses
#### 201: Agent created
Content type: `application/json`
- `id` (string, required): Agent id, prefixed `agent_`.
- `organizationId` (string, required): Owning organization. Always equal to the authenticated org — agents are never visible across orgs.
- `name` (string, required): Human-readable name. Unique within an organization.
- `systemPrompt` (string, required): Initial agent instructions. Hydrated as the session's `systemPrompt` default when no per-call value is supplied.
- `voice` (string | null, required): Default TTS voice id. `null` when unset — the router picks a sane default per provider.
- `intent` (object, required): Routing defaults stored on an agent row. Hydrated into a session's `RoutingIntent` when the session is created with `agentId` and no inline `intent`. Note: agents accept a smaller `optimizeFor` enum than the per-session `RoutingIntent`.
- `language` (string, required): BCP-47 language tag, e.g. `en`, `en-US`, `es-MX`.
- `optimizeFor` (string, optional): One of: `"latency"`, `"quality"`, `"cost"`.
- `llmOptions` (object | null, required): Optional LLM tuning defaults. `null` when unset.
- Variant (object):
- `temperature` (number, optional): Min: 0. Max: 2.
- `maxTokens` (integer, optional): Min: 1.
- `model` (string, optional): Pin a specific LLM model id. When omitted, the router picks per intent.
- `createdAt` (string (date-time), required)
- `updatedAt` (string (date-time), required)
- `stackPreferences` (object | null, required): Per-agent stack preferences. `null` when the agent has no preferences set.
- Variant (object):
- `allowedProviders` (object, optional): Per-pipeline-layer provider allowlists (e.g. STT must be Deepgram or AssemblyAI). Each entry is either `"{vendor}"` (vendor wildcard: allow any model from that vendor) or `"{vendor}:{model}"` (allow only that specific model). Failover stays active across all entries in the layer.
- `stt` (array of string, optional): Allowed STT provider entries. Empty / absent = no constraint.
- Allowed STT provider entry. Either a vendor id (e.g. `"deepgram"`, meaning any Deepgram model) or `"{vendor}:{model}"` (e.g. `"deepgram:nova-3"`, meaning only Nova-3 with no fallback to other Deepgram models within the vendor). Both forms can be mixed in the same array.
- `llm` (array of string, optional): Allowed LLM provider entries. Empty / absent = no constraint.
- Allowed LLM provider entry. Either a vendor id (e.g. `"openai"`, meaning any OpenAI model) or `"{vendor}:{model}"` (e.g. `"openai:gpt-5"`, meaning only that model). Both forms can be mixed in the same array.
- `tts` (array of string, optional): Allowed TTS provider entries. Empty / absent = no constraint.
- Allowed TTS provider entry. Either a vendor id (e.g. `"elevenlabs"`, meaning any ElevenLabs model) or `"{vendor}:{model}"` (e.g. `"elevenlabs:eleven_flash_v2_5"`, meaning only that model). Both forms can be mixed in the same array.
Example:
```json
{
"id": "agent_01HW...",
"organizationId": "string",
"name": "string",
"systemPrompt": "string",
"voice": "string",
"intent": {
"language": "en-US",
"optimizeFor": "latency"
},
"llmOptions": {
"temperature": 0,
"maxTokens": 256,
"model": "string"
},
"createdAt": "2026-01-01T00:00:00Z",
"updatedAt": "2026-01-01T00:00:00Z",
"stackPreferences": {
"allowedProviders": {
"stt": [
"deepgram"
],
"llm": [
"openai"
],
"tts": [
"elevenlabs"
]
}
}
}
```
#### 400: Request body or query parameters failed validation.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Invalid request body",
"code": "VALIDATION_ERROR"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
#### 409: Another agent in the same organization already uses this name.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Agent with this name already exists",
"code": "AGENT_NAME_CONFLICT"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/agents' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"name":"string","systemPrompt":"string","voice":"string","intent":{"language":"en-US","optimizeFor":"latency"},"llmOptions":{"temperature":0,"maxTokens":256,"model":"string"},"stackPreferences":{"allowedProviders":{"stt":["deepgram"],"llm":["openai"],"tts":["elevenlabs"]}}}'
```
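The same request can be sketched in Python with only the standard library. The helper names are hypothetical, and the error handling assumes every error response is JSON with a `code` field, as in the examples above. Request building is split from sending so the body can be inspected offline.

```python
import json
import urllib.error
import urllib.request


def build_create_agent_request(api_key, name, system_prompt, language, optimize_for=None):
    """Build (but do not send) a POST /v1/agents request with the required fields."""
    intent = {"language": language}
    if optimize_for is not None:
        intent["optimizeFor"] = optimize_for
    body = {"name": name, "systemPrompt": system_prompt, "intent": intent}
    return urllib.request.Request(
        "https://api.speko.dev/v1/agents",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def create_agent(api_key, **fields):
    """Send the request. Returns (agent, None) on 201 or (None, error_code) on failure."""
    req = build_create_agent_request(api_key, **fields)
    try:
        with urllib.request.urlopen(req) as resp:
            return json.load(resp), None
    except urllib.error.HTTPError as err:
        # e.g. "AGENT_NAME_CONFLICT" on 409, "VALIDATION_ERROR" on 400
        return None, json.load(err).get("code")
```

A caller that gets `AGENT_NAME_CONFLICT` back could, for instance, look the existing agent up via `GET /v1/agents` instead of retrying with the same name.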
# Delete agent (/api-reference/agents-delete)
Removes the agent and its registered tools. The organization's only remaining agent cannot be deleted and returns 409.
## DELETE /v1/agents/{id}
Delete an agent
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the token is an API key (`sk_live_...`).
### Parameters
- `id` (string, in path, required): Agent id, prefixed `agent_`.
### Responses
#### 200: Agent deleted
Content type: `application/json`
- `deleted` (boolean, required): Always `true`.
Example:
```json
{
"deleted": true
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
#### 404: No agent with this id exists in the authenticated organization.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "not found",
"code": "NOT_FOUND"
}
```
#### 409: The organization's only remaining agent cannot be deleted.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Cannot delete the only agent in this organization",
"code": "LAST_AGENT"
}
```
### Example request
```bash
curl -X DELETE 'https://api.speko.dev/v1/agents/{id}' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
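A cleanup script usually needs to tell the four documented outcomes apart. The sketch below maps a status code and parsed JSON body to a short label; the function name and the labels are illustrative choices, not part of the API.

```python
def describe_delete(status, body):
    """Classify the response of DELETE /v1/agents/{id} (hypothetical helper)."""
    if status == 200 and body.get("deleted") is True:
        return "deleted"
    if status == 409 and body.get("code") == "LAST_AGENT":
        # The organization's only remaining agent cannot be removed.
        return "kept-last-agent"
    if status == 404:
        # No such agent in this organization.
        return "not-found"
    if status == 401:
        return "unauthorized"
    return f"unexpected-{status}"


print(describe_delete(200, {"deleted": True}))  # deleted
print(describe_delete(409, {"error": "...", "code": "LAST_AGENT"}))  # kept-last-agent
```

Whether `not-found` counts as success is up to the caller; an idempotent cleanup job might treat it as already deleted.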
# Get agent (/api-reference/agents-get)
Fetch a single agent by id. Scoped to the authenticated organization — agents owned by another org always return 404.
## GET /v1/agents/{id}
Get an agent
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the token is an API key (`sk_live_...`).
### Parameters
- `id` (string, in path, required): Agent id, prefixed `agent_`.
### Responses
#### 200: Agent
Content type: `application/json`
- `id` (string, required): Agent id, prefixed `agent_`.
- `organizationId` (string, required): Owning organization. Always equal to the authenticated org — agents are never visible across orgs.
- `name` (string, required): Human-readable name. Unique within an organization.
- `systemPrompt` (string, required): Initial agent instructions. Hydrated as the session's `systemPrompt` default when no per-call value is supplied.
- `voice` (string | null, required): Default TTS voice id. `null` when unset — the router picks a sane default per provider.
- `intent` (object, required): Routing defaults stored on an agent row. Hydrated into a session's `RoutingIntent` when the session is created with `agentId` and no inline `intent`. Note: agents accept a smaller `optimizeFor` enum than the per-session `RoutingIntent`.
- `language` (string, required): BCP-47 language tag, e.g. `en`, `en-US`, `es-MX`.
- `optimizeFor` (string, optional): One of: `"latency"`, `"quality"`, `"cost"`.
- `llmOptions` (object | null, required): Optional LLM tuning defaults. `null` when unset.
- Variant (object):
- `temperature` (number, optional): Min: 0. Max: 2.
- `maxTokens` (integer, optional): Min: 1.
- `model` (string, optional): Pin a specific LLM model id. When omitted, the router picks per intent.
- `createdAt` (string (date-time), required)
- `updatedAt` (string (date-time), required)
- `stackPreferences` (object | null, required): Per-agent stack preferences. `null` when the agent has no preferences set.
- Variant (object):
- `allowedProviders` (object, optional): Per-pipeline-layer provider allowlists (e.g. STT must be Deepgram or AssemblyAI). Each entry is either `"{vendor}"` (vendor wildcard: allow any model from that vendor) or `"{vendor}:{model}"` (allow only that specific model). Failover stays active across all entries in the layer.
- `stt` (array of string, optional): Allowed STT provider entries. Empty / absent = no constraint.
- Allowed STT provider entry. Either a vendor id (e.g. `"deepgram"`, meaning any Deepgram model) or `"{vendor}:{model}"` (e.g. `"deepgram:nova-3"`, meaning only Nova-3 with no fallback to other Deepgram models within the vendor). Both forms can be mixed in the same array.
- `llm` (array of string, optional): Allowed LLM provider entries. Empty / absent = no constraint.
- Allowed LLM provider entry. Either a vendor id (e.g. `"openai"`, meaning any OpenAI model) or `"{vendor}:{model}"` (e.g. `"openai:gpt-5"`, meaning only that model). Both forms can be mixed in the same array.
- `tts` (array of string, optional): Allowed TTS provider entries. Empty / absent = no constraint.
- Allowed TTS provider entry. Either a vendor id (e.g. `"elevenlabs"`, meaning any ElevenLabs model) or `"{vendor}:{model}"` (e.g. `"elevenlabs:eleven_flash_v2_5"`, meaning only that model). Both forms can be mixed in the same array.
Example:
```json
{
"id": "agent_01HW...",
"organizationId": "string",
"name": "string",
"systemPrompt": "string",
"voice": "string",
"intent": {
"language": "en-US",
"optimizeFor": "latency"
},
"llmOptions": {
"temperature": 0,
"maxTokens": 256,
"model": "string"
},
"createdAt": "2026-01-01T00:00:00Z",
"updatedAt": "2026-01-01T00:00:00Z",
"stackPreferences": {
"allowedProviders": {
"stt": [
"deepgram"
],
"llm": [
"openai"
],
"tts": [
"elevenlabs"
]
}
}
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
#### 404: No agent with this id exists in the authenticated organization.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "not found",
"code": "NOT_FOUND"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/agents/{id}' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
# List agents (/api-reference/agents-list)
Returns every agent belonging to the authenticated organization. Agents are scoped per-org — IDs from another org will never appear here.
## GET /v1/agents
List agents
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the token is an API key (`sk_live_...`).
### Responses
#### 200: Agents owned by the authenticated organization.
Content type: `application/json`
- Items (object):
- `id` (string, required): Agent id, prefixed `agent_`.
- `organizationId` (string, required): Owning organization. Always equal to the authenticated org — agents are never visible across orgs.
- `name` (string, required): Human-readable name. Unique within an organization.
- `systemPrompt` (string, required): Initial agent instructions. Hydrated as the session's `systemPrompt` default when no per-call value is supplied.
- `voice` (string | null, required): Default TTS voice id. `null` when unset — the router picks a sane default per provider.
- `intent` (object, required): Routing defaults stored on an agent row. Hydrated into a session's `RoutingIntent` when the session is created with `agentId` and no inline `intent`. Note: agents accept a smaller `optimizeFor` enum than the per-session `RoutingIntent`.
- `language` (string, required): BCP-47 language tag, e.g. `en`, `en-US`, `es-MX`.
- `optimizeFor` (string, optional): One of: `"latency"`, `"quality"`, `"cost"`.
- `llmOptions` (object | null, required): Optional LLM tuning defaults. `null` when unset.
- Variant (object):
- `temperature` (number, optional): Min: 0. Max: 2.
- `maxTokens` (integer, optional): Min: 1.
- `model` (string, optional): Pin a specific LLM model id. When omitted, the router picks per intent.
- `createdAt` (string (date-time), required)
- `updatedAt` (string (date-time), required)
- `stackPreferences` (object | null, required): Per-agent stack preferences. `null` when the agent has no preferences set.
- Variant (object):
- `allowedProviders` (object, optional): Per-pipeline-layer provider allowlists (e.g. STT must be Deepgram or AssemblyAI). Each entry is either `"{vendor}"` (vendor wildcard: allow any model from that vendor) or `"{vendor}:{model}"` (allow only that specific model). Failover stays active across all entries in the layer.
- `stt` (array of string, optional): Allowed STT provider entries. Empty / absent = no constraint.
- Allowed STT provider entry. Either a vendor id (e.g. `"deepgram"`, meaning any Deepgram model) or `"{vendor}:{model}"` (e.g. `"deepgram:nova-3"`, meaning only Nova-3 with no fallback to other Deepgram models within the vendor). Both forms can be mixed in the same array.
- `llm` (array of string, optional): Allowed LLM provider entries. Empty / absent = no constraint.
- Allowed LLM provider entry. Either a vendor id (e.g. `"openai"`, meaning any OpenAI model) or `"{vendor}:{model}"` (e.g. `"openai:gpt-5"`, meaning only that model). Both forms can be mixed in the same array.
- `tts` (array of string, optional): Allowed TTS provider entries. Empty / absent = no constraint.
- Allowed TTS provider entry. Either a vendor id (e.g. `"elevenlabs"`, meaning any ElevenLabs model) or `"{vendor}:{model}"` (e.g. `"elevenlabs:eleven_flash_v2_5"`, meaning only that model). Both forms can be mixed in the same array.
Example:
```json
[
{
"id": "agent_01HW...",
"organizationId": "string",
"name": "string",
"systemPrompt": "string",
"voice": "string",
"intent": {
"language": "en-US",
"optimizeFor": "latency"
},
"llmOptions": {
"temperature": 0,
"maxTokens": 256,
"model": "string"
},
"createdAt": "2026-01-01T00:00:00Z",
"updatedAt": "2026-01-01T00:00:00Z",
"stackPreferences": {
"allowedProviders": {
"stt": [
"deepgram"
],
"llm": [
"openai"
],
"tts": [
"elevenlabs"
]
}
}
}
]
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/agents' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
# Update agent (/api-reference/agents-update)
Partial update — every field on the body is optional and only supplied keys are written. Updating `name` to one already used by another agent in the org returns 409.
## PATCH /v1/agents/{id}
Update an agent
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the token is an API key (`sk_live_...`).
### Parameters
- `id` (string, in path, required): Agent id, prefixed `agent_`.
### Request body
Required.
Content types: `application/json`
- `name` (string, optional): Human-readable name. Must remain unique within the organization.
- `systemPrompt` (string, optional)
- `voice` (string, optional)
- `intent` (object, optional): Routing defaults stored on an agent row. Hydrated into a session's `RoutingIntent` when the session is created with `agentId` and no inline `intent`. Note: agents accept a smaller `optimizeFor` enum than the per-session `RoutingIntent`.
- `language` (string, required): BCP-47 language tag, e.g. `en`, `en-US`, `es-MX`.
- `optimizeFor` (string, optional): One of: `"latency"`, `"quality"`, `"cost"`.
- `llmOptions` (object, optional): Optional LLM tuning defaults stored on an agent. Strict object — only these three keys are accepted.
- `temperature` (number, optional): Min: 0. Max: 2.
- `maxTokens` (integer, optional): Min: 1.
- `model` (string, optional): Pin a specific LLM model id. When omitted, the router picks per intent.
- `stackPreferences` (object, optional): Per-agent stack preferences. Empty / missing layers leave the router unconstrained for that layer; failover stays active within the allowed set.
- `allowedProviders` (object, optional): Per-pipeline-layer provider allowlists (e.g. STT must be Deepgram or AssemblyAI). Each entry is either `"{vendor}"` (vendor wildcard: allow any model from that vendor) or `"{vendor}:{model}"` (allow only that specific model). Failover stays active across all entries in the layer.
- `stt` (array of string, optional): Allowed STT provider entries. Empty / absent = no constraint.
- Allowed STT provider entry. Either a vendor id (e.g. `"deepgram"`, meaning any Deepgram model) or `"{vendor}:{model}"` (e.g. `"deepgram:nova-3"`, meaning only Nova-3 with no fallback to other Deepgram models within the vendor). Both forms can be mixed in the same array.
- `llm` (array of string, optional): Allowed LLM provider entries. Empty / absent = no constraint.
- Allowed LLM provider entry. Either a vendor id (e.g. `"openai"`, meaning any OpenAI model) or `"{vendor}:{model}"` (e.g. `"openai:gpt-5"`, meaning only that model). Both forms can be mixed in the same array.
- `tts` (array of string, optional): Allowed TTS provider entries. Empty / absent = no constraint.
- Allowed TTS provider entry. Either a vendor id (e.g. `"elevenlabs"`, meaning any ElevenLabs model) or `"{vendor}:{model}"` (e.g. `"elevenlabs:eleven_flash_v2_5"`, meaning only that model). Both forms can be mixed in the same array.
Example body:
```json
{
"name": "string",
"systemPrompt": "string",
"voice": "string",
"intent": {
"language": "en-US",
"optimizeFor": "latency"
},
"llmOptions": {
"temperature": 0,
"maxTokens": 256,
"model": "string"
},
"stackPreferences": {
"allowedProviders": {
"stt": [
"deepgram"
],
"llm": [
"openai"
],
"tts": [
"elevenlabs"
]
}
}
}
```
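Because `llmOptions` is a strict object with bounded fields, it can be worth checking a payload before sending it. The following is a minimal client-side sketch of the documented limits (three allowed keys, `temperature` in 0 to 2, `maxTokens` at least 1); the function name and the error strings are made up for illustration, and the server remains the source of truth.

```python
ALLOWED_LLM_OPTION_KEYS = {"temperature", "maxTokens", "model"}


def validate_llm_options(options):
    """Return a list of problems found in an llmOptions dict (empty list if none)."""
    errors = []
    unknown = sorted(set(options) - ALLOWED_LLM_OPTION_KEYS)
    if unknown:
        errors.append("unknown keys: " + ", ".join(unknown))
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        errors.append("temperature must be between 0 and 2")
    max_tokens = options.get("maxTokens")
    if max_tokens is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        is_int = isinstance(max_tokens, int) and not isinstance(max_tokens, bool)
        if not is_int or max_tokens < 1:
            errors.append("maxTokens must be an integer >= 1")
    return errors


print(validate_llm_options({"temperature": 0.7, "maxTokens": 256}))  # []
print(validate_llm_options({"temperature": 3}))  # ['temperature must be between 0 and 2']
```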
### Responses
#### 200: Agent updated
Content type: `application/json`
- `id` (string, required): Agent id, prefixed `agent_`.
- `organizationId` (string, required): Owning organization. Always equal to the authenticated org — agents are never visible across orgs.
- `name` (string, required): Human-readable name. Unique within an organization.
- `systemPrompt` (string, required): Initial agent instructions. Hydrated as the session's `systemPrompt` default when no per-call value is supplied.
- `voice` (string | null, required): Default TTS voice id. `null` when unset — the router picks a sane default per provider.
- `intent` (object, required): Routing defaults stored on an agent row. Hydrated into a session's `RoutingIntent` when the session is created with `agentId` and no inline `intent`. Note: agents accept a smaller `optimizeFor` enum than the per-session `RoutingIntent`.
- `language` (string, required): BCP-47 language tag, e.g. `en`, `en-US`, `es-MX`.
- `optimizeFor` (string, optional): One of: `"latency"`, `"quality"`, `"cost"`.
- `llmOptions` (object | null, required): Optional LLM tuning defaults. `null` when unset.
- Variant (object):
- `temperature` (number, optional): Min: 0. Max: 2.
- `maxTokens` (integer, optional): Min: 1.
- `model` (string, optional): Pin a specific LLM model id. When omitted, the router picks per intent.
- `createdAt` (string (date-time), required)
- `updatedAt` (string (date-time), required)
- `stackPreferences` (object | null, required): Per-agent stack preferences. `null` when the agent has no preferences set.
- Variant (object):
- `allowedProviders` (object, optional): Per-pipeline-layer provider allowlists (e.g. STT must be Deepgram or AssemblyAI). Each entry is either `"{vendor}"` (vendor wildcard: allow any model from that vendor) or `"{vendor}:{model}"` (allow only that specific model). Failover stays active across all entries in the layer.
- `stt` (array of string, optional): Allowed STT provider entries. Empty / absent = no constraint.
- Allowed STT provider entry. Either a vendor id (e.g. `"deepgram"`, meaning any Deepgram model) or `"{vendor}:{model}"` (e.g. `"deepgram:nova-3"`, meaning only Nova-3 with no fallback to other Deepgram models within the vendor). Both forms can be mixed in the same array.
- `llm` (array of string, optional): Allowed LLM provider entries. Empty / absent = no constraint.
- Allowed LLM provider entry. Either a vendor id (e.g. `"openai"`, meaning any OpenAI model) or `"{vendor}:{model}"` (e.g. `"openai:gpt-5"`, meaning only that model). Both forms can be mixed in the same array.
- `tts` (array of string, optional): Allowed TTS provider entries. Empty / absent = no constraint.
- Allowed TTS provider entry. Either a vendor id (e.g. `"elevenlabs"`, meaning any ElevenLabs model) or `"{vendor}:{model}"` (e.g. `"elevenlabs:eleven_flash_v2_5"`, meaning only that model). Both forms can be mixed in the same array.
Example:
```json
{
"id": "agent_01HW...",
"organizationId": "string",
"name": "string",
"systemPrompt": "string",
"voice": "string",
"intent": {
"language": "en-US",
"optimizeFor": "latency"
},
"llmOptions": {
"temperature": 0,
"maxTokens": 256,
"model": "string"
},
"createdAt": "2026-01-01T00:00:00Z",
"updatedAt": "2026-01-01T00:00:00Z",
"stackPreferences": {
"allowedProviders": {
"stt": [
"deepgram"
],
"llm": [
"openai"
],
"tts": [
"elevenlabs"
]
}
}
}
```
#### 400: Request body or query parameters failed validation.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Invalid request body",
"code": "VALIDATION_ERROR"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
#### 404: No agent with this id exists in the authenticated organization.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "not found",
"code": "NOT_FOUND"
}
```
#### 409: Another agent in the same organization already uses this name.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Agent with this name already exists",
"code": "AGENT_NAME_CONFLICT"
}
```
### Example request
```bash
curl -X PATCH 'https://api.speko.dev/v1/agents/{id}' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"name":"string","systemPrompt":"string","voice":"string","intent":{"language":"en-US","optimizeFor":"latency"},"llmOptions":{"temperature":0,"maxTokens":256,"model":"string"},"stackPreferences":{"allowedProviders":{"stt":["deepgram"],"llm":["openai"],"tts":["elevenlabs"]}}}'
```
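Since only supplied keys are written, a client should leave out anything it does not intend to change rather than send placeholder values. One way to sketch that in Python is a sentinel default that distinguishes "not supplied" from an explicit value; the helper name and its keyword arguments are hypothetical, while the JSON keys are the documented ones.

```python
_UNSET = object()  # sentinel: the caller did not supply this field


def build_agent_patch(*, name=_UNSET, system_prompt=_UNSET, voice=_UNSET,
                      intent=_UNSET, llm_options=_UNSET):
    """Build a PATCH /v1/agents/{id} body containing only the supplied fields."""
    candidates = {
        "name": name,
        "systemPrompt": system_prompt,
        "voice": voice,
        "intent": intent,  # if supplied, it must include "language"
        "llmOptions": llm_options,
    }
    return {key: value for key, value in candidates.items() if value is not _UNSET}


print(build_agent_patch(voice="voice_123"))  # {'voice': 'voice_123'}
print(build_agent_patch(name="Support", intent={"language": "es-MX"}))
# {'name': 'Support', 'intent': {'language': 'es-MX'}}
```

The resulting dict can be serialized with `json.dumps` and sent as the request body.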
# Callbacks (/api-reference/callbacks)
List, inspect, cancel, and dispatch scheduled callbacks created from call analysis.
## GET /v1/callbacks
List scheduled callbacks
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the token is an API key (`sk_live_...`).
### Parameters
- `status` (string, in query, optional): One of: `"scheduled"`, `"dispatching"`, `"dispatched"`, `"cancelled"`, `"failed"`.
- `source_session_id` (string, in query, optional)
- `limit` (integer, in query, optional): Default: `25`. Min: 1. Max: 100.
### Responses
#### 200: Scheduled callbacks
Content type: `application/json`
- `callbacks` (array of object, required)
  - `id` (string, required)
  - `organization_id` (string, required)
  - `source_session_id` (string | null, optional)
  - `created_session_id` (string | null, optional)
  - `agent_id` (string | null, optional)
  - `phone_number_id` (string | null, optional)
  - `to_number` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
  - `from_number` (string | null, optional)
  - `scheduled_at` (string (date-time), required)
  - `status` (string, required): One of: `"scheduled"`, `"dispatching"`, `"dispatched"`, `"cancelled"`, `"failed"`.
  - `reason` (string | null, optional)
  - `instructions` (string | null, optional)
  - `summary` (string | null, optional)
  - `pipeline_config` (object, required)
  - `metadata` (object, required)
  - `failure_cause` (string | null, optional)
  - `attempted_at` (string (date-time) | null, optional)
  - `dispatched_at` (string (date-time) | null, optional)
  - `cancelled_at` (string (date-time) | null, optional)
  - `created_at` (string (date-time), required)
  - `updated_at` (string (date-time), required)
Example:
```json
{
"callbacks": [
{
"id": "string",
"organization_id": "string",
"source_session_id": "string",
"created_session_id": "string",
"agent_id": "string",
"phone_number_id": "string",
"to_number": "+12015551234",
"from_number": "string",
"scheduled_at": "2026-01-01T00:00:00Z",
"status": "scheduled",
"reason": "string",
"instructions": "string",
"summary": "string",
"pipeline_config": {},
"metadata": {},
"failure_cause": "string",
"attempted_at": "2026-01-01T00:00:00Z",
"dispatched_at": "2026-01-01T00:00:00Z",
"cancelled_at": "2026-01-01T00:00:00Z",
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z"
}
]
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/callbacks' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
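The three query parameters above can be assembled and range-checked before the request is sent. The following is a sketch only: `buildListCallbacksUrl` and `ListCallbacksQuery` are hypothetical names, and the checks mirror the documented constraints (the five `status` values, `limit` between 1 and 100).

```ts
type CallbackStatus = 'scheduled' | 'dispatching' | 'dispatched' | 'cancelled' | 'failed';

interface ListCallbacksQuery {
  status?: CallbackStatus;
  source_session_id?: string;
  limit?: number; // documented: default 25, min 1, max 100
}

// Build the GET /v1/callbacks URL, omitting parameters that were not provided.
function buildListCallbacksUrl(baseUrl: string, query: ListCallbacksQuery): string {
  const parts: string[] = [];
  if (query.status !== undefined) {
    parts.push('status=' + encodeURIComponent(query.status));
  }
  if (query.source_session_id !== undefined) {
    parts.push('source_session_id=' + encodeURIComponent(query.source_session_id));
  }
  if (query.limit !== undefined) {
    const limit = query.limit;
    // Reject non-integers (including NaN) and values outside the documented range.
    if (Math.floor(limit) !== limit || limit < 1 || limit > 100) {
      throw new RangeError('limit must be an integer between 1 and 100');
    }
    parts.push('limit=' + String(limit));
  }
  const path = baseUrl + '/v1/callbacks';
  return parts.length > 0 ? path + '?' + parts.join('&') : path;
}

console.log(buildListCallbacksUrl('https://api.speko.dev', { status: 'failed', limit: 50 }));
```

Leaving `limit` out of the URL lets the server apply its documented default of 25.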
## GET /v1/callbacks/{id}
Get scheduled callback
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
### Responses
#### 200: Scheduled callback
Content type: `application/json`
- `id` (string, required)
- `organization_id` (string, required)
- `source_session_id` (string | null, optional)
- `created_session_id` (string | null, optional)
- `agent_id` (string | null, optional)
- `phone_number_id` (string | null, optional)
- `to_number` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
- `from_number` (string | null, optional)
- `scheduled_at` (string (date-time), required)
- `status` (string, required): One of: `"scheduled"`, `"dispatching"`, `"dispatched"`, `"cancelled"`, `"failed"`.
- `reason` (string | null, optional)
- `instructions` (string | null, optional)
- `summary` (string | null, optional)
- `pipeline_config` (object, required)
- `metadata` (object, required)
- `failure_cause` (string | null, optional)
- `attempted_at` (string (date-time) | null, optional)
- `dispatched_at` (string (date-time) | null, optional)
- `cancelled_at` (string (date-time) | null, optional)
- `created_at` (string (date-time), required)
- `updated_at` (string (date-time), required)
Example:
```json
{
"id": "string",
"organization_id": "string",
"source_session_id": "string",
"created_session_id": "string",
"agent_id": "string",
"phone_number_id": "string",
"to_number": "+12015551234",
"from_number": "string",
"scheduled_at": "2026-01-01T00:00:00Z",
"status": "scheduled",
"reason": "string",
"instructions": "string",
"summary": "string",
"pipeline_config": {},
"metadata": {},
"failure_cause": "string",
"attempted_at": "2026-01-01T00:00:00Z",
"dispatched_at": "2026-01-01T00:00:00Z",
"cancelled_at": "2026-01-01T00:00:00Z",
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/callbacks/{id}' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
## POST /v1/callbacks/{id}/cancel
Cancel scheduled callback
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
### Request body
Optional.
Content types: `application/json`
- `reason` (string, optional)
Example body:
```json
{
"reason": "string"
}
```
### Responses
#### 200: Scheduled callback
Content type: `application/json`
- `id` (string, required)
- `organization_id` (string, required)
- `source_session_id` (string | null, optional)
- `created_session_id` (string | null, optional)
- `agent_id` (string | null, optional)
- `phone_number_id` (string | null, optional)
- `to_number` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
- `from_number` (string | null, optional)
- `scheduled_at` (string (date-time), required)
- `status` (string, required): One of: `"scheduled"`, `"dispatching"`, `"dispatched"`, `"cancelled"`, `"failed"`.
- `reason` (string | null, optional)
- `instructions` (string | null, optional)
- `summary` (string | null, optional)
- `pipeline_config` (object, required)
- `metadata` (object, required)
- `failure_cause` (string | null, optional)
- `attempted_at` (string (date-time) | null, optional)
- `dispatched_at` (string (date-time) | null, optional)
- `cancelled_at` (string (date-time) | null, optional)
- `created_at` (string (date-time), required)
- `updated_at` (string (date-time), required)
Example:
```json
{
"id": "string",
"organization_id": "string",
"source_session_id": "string",
"created_session_id": "string",
"agent_id": "string",
"phone_number_id": "string",
"to_number": "+12015551234",
"from_number": "string",
"scheduled_at": "2026-01-01T00:00:00Z",
"status": "scheduled",
"reason": "string",
"instructions": "string",
"summary": "string",
"pipeline_config": {},
"metadata": {},
"failure_cause": "string",
"attempted_at": "2026-01-01T00:00:00Z",
"dispatched_at": "2026-01-01T00:00:00Z",
"cancelled_at": "2026-01-01T00:00:00Z",
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/callbacks/{id}/cancel' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"reason":"string"}'
```
## POST /v1/callbacks/{id}/dispatch
Dispatch scheduled callback now
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
### Responses
#### 200: Scheduled callback
Content type: `application/json`
- `id` (string, required)
- `organization_id` (string, required)
- `source_session_id` (string | null, optional)
- `created_session_id` (string | null, optional)
- `agent_id` (string | null, optional)
- `phone_number_id` (string | null, optional)
- `to_number` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
- `from_number` (string | null, optional)
- `scheduled_at` (string (date-time), required)
- `status` (string, required): One of: `"scheduled"`, `"dispatching"`, `"dispatched"`, `"cancelled"`, `"failed"`.
- `reason` (string | null, optional)
- `instructions` (string | null, optional)
- `summary` (string | null, optional)
- `pipeline_config` (object, required)
- `metadata` (object, required)
- `failure_cause` (string | null, optional)
- `attempted_at` (string (date-time) | null, optional)
- `dispatched_at` (string (date-time) | null, optional)
- `cancelled_at` (string (date-time) | null, optional)
- `created_at` (string (date-time), required)
- `updated_at` (string (date-time), required)
Example:
```json
{
"id": "string",
"organization_id": "string",
"source_session_id": "string",
"created_session_id": "string",
"agent_id": "string",
"phone_number_id": "string",
"to_number": "+12015551234",
"from_number": "string",
"scheduled_at": "2026-01-01T00:00:00Z",
"status": "scheduled",
"reason": "string",
"instructions": "string",
"summary": "string",
"pipeline_config": {},
"metadata": {},
"failure_cause": "string",
"attempted_at": "2026-01-01T00:00:00Z",
"dispatched_at": "2026-01-01T00:00:00Z",
"cancelled_at": "2026-01-01T00:00:00Z",
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/callbacks/{id}/dispatch' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
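One possible use of this endpoint is to dispatch callbacks whose scheduled time has already passed. The sketch below picks those out of a `GET /v1/callbacks` result. It assumes that only callbacks in the `scheduled` state are worth dispatching manually, which this reference does not state; `overdueCallbackIds` is a hypothetical helper.

```ts
// Subset of the documented callback fields needed for this check.
interface CallbackSummary {
  id: string;
  status: string;
  scheduled_at: string; // ISO 8601 date-time
}

// Return ids of callbacks still marked "scheduled" whose scheduled_at is at or before `now`.
function overdueCallbackIds(callbacks: CallbackSummary[], now: Date): string[] {
  const ids: string[] = [];
  for (const cb of callbacks) {
    const dueAt = Date.parse(cb.scheduled_at);
    // Skip entries with an unparseable timestamp instead of guessing.
    if (cb.status === 'scheduled' && !isNaN(dueAt) && dueAt <= now.getTime()) {
      ids.push(cb.id);
    }
  }
  return ids;
}

const exampleCallbacks: CallbackSummary[] = [
  { id: 'a', status: 'scheduled', scheduled_at: '2026-01-01T00:00:00Z' },
  { id: 'b', status: 'scheduled', scheduled_at: '2026-01-03T00:00:00Z' },
  { id: 'c', status: 'dispatched', scheduled_at: '2026-01-01T00:00:00Z' },
];
console.log(overdueCallbackIds(exampleCallbacks, new Date('2026-01-02T00:00:00Z')));
```

Each returned id could then be passed to `POST /v1/callbacks/{id}/dispatch`.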
# Calls (/api-reference/calls)
Inspect call detail, events, reports, recordings, and live call transfers.
## GET /v1/calls/{id}
Get call detail
Returns the call session, transcript, report if available, transfer attempts, and recording metadata.
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
### Responses
#### 200: Call detail
Content type: `application/json`
- `id` (string, required)
- `call_id` (string, required)
- `resource_uri` (string, required)
- `agent_id` (string | null, optional)
- `status` (string, required)
- `kind` (string, required)
- `room_name` (string | null, optional)
- `language` (string, optional)
- `pipeline_config` (object, optional)
- `metadata` (object, optional)
- `created_at` (string (date-time), required)
- `updated_at` (string (date-time), optional)
- `ended_at` (string (date-time) | null, optional)
- `duration_seconds` (integer | null, optional)
- `recording_status` (string | null, optional)
- `recording_duration_ms` (integer | null, optional)
- `recording_resource_uri` (string, optional)
- `report` (object, optional)
  - `session_id` (string, required)
  - `organization_id` (string, optional)
  - `summary` (string, required)
  - `outcome` (string, required)
  - `structured_data` (object, optional)
  - `transcript` (object, required)
    - `entries` (array of object, optional)
      - `id` (string, required)
      - `index` (integer, required)
      - `source` (string, required): One of: `"user"`, `"agent"`, `"system"`.
      - `text` (string, required)
      - `started_at` (string (date-time), required)
      - `ended_at` (string (date-time) | null, optional)
      - `provider` (string | null, optional)
      - `model` (string | null, optional)
      - `metadata` (object, required)
  - `cost_micro_usd` (string, required)
  - `cost_breakdown` (array of object, optional)
  - `metadata` (object, optional)
  - `scheduled_callback` (object, optional)
  - `analysis_status` (string, optional): One of: `"heuristic"`, `"completed"`, `"failed"`.
  - `post_call_webhook_status` (string, optional)
  - `created_at` (string (date-time), optional)
  - `updated_at` (string (date-time), optional)
- `transfers` (array of object, optional)
  - `id` (string, required)
  - `session_id` (string, required)
  - `organization_id` (string, required)
  - `kind` (string, required): One of: `"blind"`, `"warm"`.
  - `status` (string, required): One of: `"requested"`, `"screening"`, `"bridging"`, `"completed"`, `"failed"`, `"cancelled"`.
  - `transfer_to` (string, required)
  - `from_room_name` (string | null, optional)
  - `consultation_room_name` (string | null, optional)
  - `caller_participant_identity` (string | null, optional)
  - `recipient_participant_identity` (string | null, optional)
  - `outbound_trunk_id` (string | null, optional)
  - `screening_prompt` (string | null, optional)
  - `summary` (string | null, optional)
  - `failure_cause` (string | null, optional)
  - `metadata` (object, required)
  - `created_at` (string (date-time), required)
  - `updated_at` (string (date-time), required)
  - `completed_at` (string (date-time) | null, optional)
- `transcript` (object, required)
  - `entries` (array of object, optional)
    - `id` (string, required)
    - `index` (integer, required)
    - `source` (string, required): One of: `"user"`, `"agent"`, `"system"`.
    - `text` (string, required)
    - `started_at` (string (date-time), required)
    - `ended_at` (string (date-time) | null, optional)
    - `provider` (string | null, optional)
    - `model` (string | null, optional)
    - `metadata` (object, required)
- `span_tree` (object, optional)
Example:
```json
{
"id": "string",
"call_id": "string",
"resource_uri": "string",
"agent_id": "string",
"status": "string",
"kind": "string",
"room_name": "string",
"language": "string",
"pipeline_config": {},
"metadata": {},
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z",
"ended_at": "2026-01-01T00:00:00Z",
"duration_seconds": 0,
"recording_status": "string",
"recording_duration_ms": 0,
"recording_resource_uri": "string",
"report": {
"session_id": "string",
"organization_id": "string",
"summary": "string",
"outcome": "string",
"structured_data": {},
"transcript": {
"entries": [
{
"id": "string",
"index": 0,
"source": "user",
"text": "string",
"started_at": "2026-01-01T00:00:00Z",
"ended_at": "2026-01-01T00:00:00Z",
"provider": "string",
"model": "string",
"metadata": {}
}
]
},
"cost_micro_usd": "string",
"cost_breakdown": [
{}
],
"metadata": {},
"scheduled_callback": {},
"analysis_status": "heuristic",
"post_call_webhook_status": "string",
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z"
},
"transfers": [
{
"id": "string",
"session_id": "string",
"organization_id": "string",
"kind": "blind",
"status": "requested",
"transfer_to": "string",
"from_room_name": "string",
"consultation_room_name": "string",
"caller_participant_identity": "string",
"recipient_participant_identity": "string",
"outbound_trunk_id": "string",
"screening_prompt": "string",
"summary": "string",
"failure_cause": "string",
"metadata": {},
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z",
"completed_at": "2026-01-01T00:00:00Z"
}
],
"transcript": {
"entries": [
{
"id": "string",
"index": 0,
"source": "user",
"text": "string",
"started_at": "2026-01-01T00:00:00Z",
"ended_at": "2026-01-01T00:00:00Z",
"provider": "string",
"model": "string",
"metadata": {}
}
]
},
"span_tree": {}
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/calls/{id}' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
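The top-level `transcript.entries` array in the call detail response can be flattened into readable lines. This is a minimal sketch that assumes `index` reflects turn order, which the field name suggests but the reference does not spell out. `formatTranscript` is an illustrative name.

```ts
// Subset of the documented transcript entry fields.
interface TranscriptEntry {
  index: number;
  source: 'user' | 'agent' | 'system';
  text: string;
}

interface Transcript {
  entries?: TranscriptEntry[]; // optional in the documented schema
}

// Return one "[source] text" line per entry, ordered by index.
function formatTranscript(transcript: Transcript): string[] {
  // Copy before sorting so the caller's array is left untouched.
  const entries: TranscriptEntry[] = transcript.entries ? transcript.entries.slice() : [];
  entries.sort((a, b) => a.index - b.index);
  return entries.map((entry) => '[' + entry.source + '] ' + entry.text);
}

const exampleTranscript: Transcript = {
  entries: [
    { index: 1, source: 'user', text: 'Hi, I need to reschedule.' },
    { index: 0, source: 'agent', text: 'Hello, how can I help?' },
  ],
};
console.log(formatTranscript(exampleTranscript).join('\n'));
```

Because `entries` is optional, the helper returns an empty array when a call has no transcript yet.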
## GET /v1/calls/{id}/events
List call events
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
### Responses
#### 200: Call events
Content type: `application/json`
- `events` (array of object, required)
  - `id` (string, required)
  - `session_id` (string | null, optional)
  - `organization_id` (string, required)
  - `provider` (string, required)
  - `event_type` (string, required)
  - `status` (string | null, optional)
  - `failure_cause` (string | null, optional)
  - `sip_status_code` (integer | null, optional)
  - `sip_status` (string | null, optional)
  - `occurred_at` (string (date-time), required)
  - `payload` (object, required)
  - `created_at` (string (date-time), required)
Example:
```json
{
"events": [
{
"id": "string",
"session_id": "string",
"organization_id": "string",
"provider": "string",
"event_type": "string",
"status": "string",
"failure_cause": "string",
"sip_status_code": 0,
"sip_status": "string",
"occurred_at": "2026-01-01T00:00:00Z",
"payload": {},
"created_at": "2026-01-01T00:00:00Z"
}
]
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/calls/{id}/events' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
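When debugging a failed call, the events that carry a SIP failure code are usually the interesting ones. The sketch below filters the `events` array for `sip_status_code` values of 400 and above (the 4xx, 5xx, and 6xx failure classes in SIP) and orders them by `occurred_at`. The `event_type` values in the sample data are made up for illustration.

```ts
// Subset of the documented call event fields.
interface CallEvent {
  id: string;
  event_type: string;
  sip_status_code?: number | null;
  occurred_at: string; // ISO 8601 date-time
}

// Keep events with a SIP failure code and sort them oldest first.
function sipFailures(events: CallEvent[]): CallEvent[] {
  return events
    .filter((e) => typeof e.sip_status_code === 'number' && e.sip_status_code >= 400)
    .sort((a, b) => Date.parse(a.occurred_at) - Date.parse(b.occurred_at));
}

const exampleEvents: CallEvent[] = [
  { id: 'e1', event_type: 'answered', sip_status_code: 200, occurred_at: '2026-01-01T00:00:03Z' },
  { id: 'e2', event_type: 'busy', sip_status_code: 486, occurred_at: '2026-01-01T00:00:02Z' },
  { id: 'e3', event_type: 'unavailable', sip_status_code: 503, occurred_at: '2026-01-01T00:00:01Z' },
  { id: 'e4', event_type: 'created', sip_status_code: null, occurred_at: '2026-01-01T00:00:00Z' },
];
console.log(sipFailures(exampleEvents).map((e) => e.id + ':' + String(e.sip_status_code)));
```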
## GET /v1/calls/{id}/report
Get call report
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
### Responses
#### 200: Call report
Content type: `application/json`
- `session_id` (string, required)
- `organization_id` (string, optional)
- `summary` (string, required)
- `outcome` (string, required)
- `structured_data` (object, optional)
- `transcript` (object, required)
  - `entries` (array of object, optional)
    - `id` (string, required)
    - `index` (integer, required)
    - `source` (string, required): One of: `"user"`, `"agent"`, `"system"`.
    - `text` (string, required)
    - `started_at` (string (date-time), required)
    - `ended_at` (string (date-time) | null, optional)
    - `provider` (string | null, optional)
    - `model` (string | null, optional)
    - `metadata` (object, required)
- `cost_micro_usd` (string, required)
- `cost_breakdown` (array of object, optional)
- `metadata` (object, optional)
- `scheduled_callback` (object, optional)
- `analysis_status` (string, optional): One of: `"heuristic"`, `"completed"`, `"failed"`.
- `post_call_webhook_status` (string, optional)
- `created_at` (string (date-time), optional)
- `updated_at` (string (date-time), optional)
Example:
```json
{
"session_id": "string",
"organization_id": "string",
"summary": "string",
"outcome": "string",
"structured_data": {},
"transcript": {
"entries": [
{
"id": "string",
"index": 0,
"source": "user",
"text": "string",
"started_at": "2026-01-01T00:00:00Z",
"ended_at": "2026-01-01T00:00:00Z",
"provider": "string",
"model": "string",
"metadata": {}
}
]
},
"cost_micro_usd": "string",
"cost_breakdown": [
{}
],
"metadata": {},
"scheduled_callback": {},
"analysis_status": "heuristic",
"post_call_webhook_status": "string",
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/calls/{id}/report' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
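The report returns `cost_micro_usd` as a string rather than a number. The sketch below converts it to a decimal USD string without going through floating point. It assumes the value is a non-negative integer count of millionths of a dollar, which the field name implies but this reference does not confirm. `microUsdToUsd` is a hypothetical helper.

```ts
// Convert a micro-USD integer string to a USD string with six decimal places.
// Assumption: 1 USD = 1,000,000 micro-USD and the input contains digits only.
function microUsdToUsd(costMicroUsd: string): string {
  if (!/^\d+$/.test(costMicroUsd)) {
    throw new Error('expected a non-negative integer string, got: ' + costMicroUsd);
  }
  // Drop redundant leading zeros, then left-pad so at least one digit
  // remains in front of the decimal point.
  let digits = costMicroUsd.replace(/^0+(?=\d)/, '');
  while (digits.length < 7) {
    digits = '0' + digits;
  }
  const whole = digits.slice(0, digits.length - 6);
  const fraction = digits.slice(digits.length - 6);
  return whole + '.' + fraction;
}

console.log(microUsdToUsd('1234567'));
```

Working on the string directly avoids precision loss for totals that exceed what a JavaScript number represents exactly.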
## POST /v1/calls/{id}/report/finalize
Finalize call report
Runs or re-runs analysis for a call and optionally retries post-call webhook delivery.
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
### Request body
Optional.
Content types: `application/json`
- `forceAnalysis` (boolean, optional)
- `retryWebhook` (boolean, optional)
Example body:
```json
{
"forceAnalysis": true,
"retryWebhook": true
}
```
### Responses
#### 200: Finalization result
Content type: `application/json`
- `session_id` (string, required)
- `summary` (string, required)
- `outcome` (string, required)
- `cost_micro_usd` (string, required)
- `webhook` (object, required)
Example:
```json
{
"session_id": "string",
"summary": "string",
"outcome": "string",
"cost_micro_usd": "string",
"webhook": {}
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/calls/{id}/report/finalize' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"forceAnalysis":true,"retryWebhook":true}'
```
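Both body fields are optional, so a caller can send only the flags it needs. The sketch below derives a body from a report's `analysis_status`. The policy it encodes (force analysis whenever the status is not `completed`) is an assumption made for illustration, and `finalizeBodyFor` is a hypothetical helper, since this reference does not describe the flags beyond their names.

```ts
type AnalysisStatus = 'heuristic' | 'completed' | 'failed';

interface FinalizeBody {
  forceAnalysis?: boolean;
  retryWebhook?: boolean;
}

// Hypothetical policy: request analysis unless it already completed,
// and retry the webhook only when the caller asks for it.
function finalizeBodyFor(
  analysisStatus: AnalysisStatus | undefined,
  retryWebhook: boolean
): FinalizeBody {
  const body: FinalizeBody = {};
  if (analysisStatus !== 'completed') {
    body.forceAnalysis = true;
  }
  if (retryWebhook) {
    body.retryWebhook = true;
  }
  return body;
}

console.log(JSON.stringify(finalizeBodyFor('failed', false)));
```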
## GET /v1/calls/{id}/recording
Get call recording URL
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
### Responses
#### 200: Signed recording URL
Content type: `application/json`
- `url` (string (uri), required)
Example:
```json
{
"url": "string"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/calls/{id}/recording' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
## POST /v1/calls/{id}/transfers/blind
Start blind transfer
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
### Request body
Required.
Content types: `application/json`
- `to` (string, required)
- `participantIdentity` (string, optional)
- `playDialtone` (boolean, optional)
- `ringingTimeout` (integer, optional): Min: 1. Max: 120.
- `headers` (object, optional)
Example body:
```json
{
"to": "string",
"participantIdentity": "string",
"playDialtone": true,
"ringingTimeout": 30,
"headers": {}
}
```
### Responses
#### 201: Transfer
Content type: `application/json`
- `id` (string, required)
- `session_id` (string, required)
- `organization_id` (string, required)
- `kind` (string, required): One of: `"blind"`, `"warm"`.
- `status` (string, required): One of: `"requested"`, `"screening"`, `"bridging"`, `"completed"`, `"failed"`, `"cancelled"`.
- `transfer_to` (string, required)
- `from_room_name` (string | null, optional)
- `consultation_room_name` (string | null, optional)
- `caller_participant_identity` (string | null, optional)
- `recipient_participant_identity` (string | null, optional)
- `outbound_trunk_id` (string | null, optional)
- `screening_prompt` (string | null, optional)
- `summary` (string | null, optional)
- `failure_cause` (string | null, optional)
- `metadata` (object, required)
- `created_at` (string (date-time), required)
- `updated_at` (string (date-time), required)
- `completed_at` (string (date-time) | null, optional)
Example:
```json
{
"id": "string",
"session_id": "string",
"organization_id": "string",
"kind": "blind",
"status": "requested",
"transfer_to": "string",
"from_room_name": "string",
"consultation_room_name": "string",
"caller_participant_identity": "string",
"recipient_participant_identity": "string",
"outbound_trunk_id": "string",
"screening_prompt": "string",
"summary": "string",
"failure_cause": "string",
"metadata": {},
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z",
"completed_at": "2026-01-01T00:00:00Z"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/calls/{id}/transfers/blind' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"to":"string","participantIdentity":"string","playDialtone":true,"ringingTimeout":30,"headers":{}}'
```
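The request body constraints for a blind transfer (`to` required, `ringingTimeout` an integer from 1 to 120) can be checked client-side before the call is made. This is a sketch with a hypothetical `validateBlindTransfer` helper. It does not check the format of `to`, because this reference gives no pattern for that field on this endpoint.

```ts
interface BlindTransferBody {
  to: string;
  participantIdentity?: string;
  playDialtone?: boolean;
  ringingTimeout?: number; // documented: integer, min 1, max 120
  headers?: Record<string, unknown>;
}

// Return a list of human-readable problems. An empty list means the body
// satisfies the documented constraints.
function validateBlindTransfer(body: BlindTransferBody): string[] {
  const problems: string[] = [];
  if (body.to.length === 0) {
    problems.push('to is required');
  }
  const timeout = body.ringingTimeout;
  if (timeout !== undefined && (Math.floor(timeout) !== timeout || timeout < 1 || timeout > 120)) {
    problems.push('ringingTimeout must be an integer between 1 and 120');
  }
  return problems;
}

console.log(validateBlindTransfer({ to: '+12015551234', ringingTimeout: 30 }));
```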
## POST /v1/calls/{id}/transfers/warm
Start warm transfer
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
### Request body
Required.
Content types: `application/json`
- `to` (string, optional): Pattern: `^\+[1-9]\d{6,14}$`.
- `destinations` (array of object, optional)
- `from` (string, optional): Pattern: `^\+[1-9]\d{6,14}$`.
- `participantIdentity` (string, optional)
- `outboundTrunkId` (string, optional)
- `screeningPrompt` (string, optional)
- `summary` (string, optional)
- `ringingTimeout` (integer, optional): Min: 1. Max: 120.
- `waitUntilAnswered` (boolean, optional)
- `fallback` (object, optional)
- `voicemailDetection` (object, optional)
- `metadata` (object, optional)
Example body:
```json
{
"to": "+12015551234",
"destinations": [
{}
],
"from": "+12015551234",
"participantIdentity": "string",
"outboundTrunkId": "string",
"screeningPrompt": "string",
"summary": "string",
"ringingTimeout": 30,
"waitUntilAnswered": true,
"fallback": {},
"voicemailDetection": {},
"metadata": {}
}
```
### Responses
#### 201: Transfer
Content type: `application/json`
- `routing_attempts` (array of object, optional)
  - `id` (string, required)
  - `session_id` (string, required)
  - `organization_id` (string, required)
  - `kind` (string, required): One of: `"blind"`, `"warm"`.
  - `status` (string, required): One of: `"requested"`, `"screening"`, `"bridging"`, `"completed"`, `"failed"`, `"cancelled"`.
  - `transfer_to` (string, required)
  - `from_room_name` (string | null, optional)
  - `consultation_room_name` (string | null, optional)
  - `caller_participant_identity` (string | null, optional)
  - `recipient_participant_identity` (string | null, optional)
  - `outbound_trunk_id` (string | null, optional)
  - `screening_prompt` (string | null, optional)
  - `summary` (string | null, optional)
  - `failure_cause` (string | null, optional)
  - `metadata` (object, required)
  - `created_at` (string (date-time), required)
  - `updated_at` (string (date-time), required)
  - `completed_at` (string (date-time) | null, optional)
- `next_transfer` (object, optional)
  - `id` (string, required)
  - `session_id` (string, required)
  - `organization_id` (string, required)
  - `kind` (string, required): One of: `"blind"`, `"warm"`.
  - `status` (string, required): One of: `"requested"`, `"screening"`, `"bridging"`, `"completed"`, `"failed"`, `"cancelled"`.
  - `transfer_to` (string, required)
  - `from_room_name` (string | null, optional)
  - `consultation_room_name` (string | null, optional)
  - `caller_participant_identity` (string | null, optional)
  - `recipient_participant_identity` (string | null, optional)
  - `outbound_trunk_id` (string | null, optional)
  - `screening_prompt` (string | null, optional)
  - `summary` (string | null, optional)
  - `failure_cause` (string | null, optional)
  - `metadata` (object, required)
  - `created_at` (string (date-time), required)
  - `updated_at` (string (date-time), required)
  - `completed_at` (string (date-time) | null, optional)
- `fallback` (object, optional)
- `id` (string, required)
- `session_id` (string, required)
- `organization_id` (string, required)
- `kind` (string, required): One of: `"blind"`, `"warm"`.
- `status` (string, required): One of: `"requested"`, `"screening"`, `"bridging"`, `"completed"`, `"failed"`, `"cancelled"`.
- `transfer_to` (string, required)
- `from_room_name` (string | null, optional)
- `consultation_room_name` (string | null, optional)
- `caller_participant_identity` (string | null, optional)
- `recipient_participant_identity` (string | null, optional)
- `outbound_trunk_id` (string | null, optional)
- `screening_prompt` (string | null, optional)
- `summary` (string | null, optional)
- `failure_cause` (string | null, optional)
- `metadata` (object, required)
- `created_at` (string (date-time), required)
- `updated_at` (string (date-time), required)
- `completed_at` (string (date-time) | null, optional)
Example:
```json
{
"routing_attempts": [
{
"id": "string",
"session_id": "string",
"organization_id": "string",
"kind": "blind",
"status": "requested",
"transfer_to": "string",
"from_room_name": "string",
"consultation_room_name": "string",
"caller_participant_identity": "string",
"recipient_participant_identity": "string",
"outbound_trunk_id": "string",
"screening_prompt": "string",
"summary": "string",
"failure_cause": "string",
"metadata": {},
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z",
"completed_at": "2026-01-01T00:00:00Z"
}
],
"next_transfer": {
"id": "string",
"session_id": "string",
"organization_id": "string",
"kind": "blind",
"status": "requested",
"transfer_to": "string",
"from_room_name": "string",
"consultation_room_name": "string",
"caller_participant_identity": "string",
"recipient_participant_identity": "string",
"outbound_trunk_id": "string",
"screening_prompt": "string",
"summary": "string",
"failure_cause": "string",
"metadata": {},
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z",
"completed_at": "2026-01-01T00:00:00Z"
},
"fallback": {},
"id": "string",
"session_id": "string",
"organization_id": "string",
"kind": "blind",
"status": "requested",
"transfer_to": "string",
"from_room_name": "string",
"consultation_room_name": "string",
"caller_participant_identity": "string",
"recipient_participant_identity": "string",
"outbound_trunk_id": "string",
"screening_prompt": "string",
"summary": "string",
"failure_cause": "string",
"metadata": {},
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z",
"completed_at": "2026-01-01T00:00:00Z"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/calls/{id}/transfers/warm' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"to":"+12015551234","destinations":[{}],"from":"+12015551234","participantIdentity":"string","outboundTrunkId":"string","screeningPrompt":"string","summary":"string","ringingTimeout":30,"waitUntilAnswered":true,"fallback":{},"voicemailDetection":{},"metadata":{}}'
```
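The `to` and `from` fields of the warm transfer body carry the pattern `^\+[1-9]\d{6,14}$`: a leading `+`, a non-zero first digit, and 7 to 15 digits in total. A minimal sketch of a client-side check follows. `isE164` and `invalidNumberFields` are illustrative names, not SDK functions.

```ts
// Pattern copied from the `to` and `from` fields in the request body.
const E164_PATTERN = /^\+[1-9]\d{6,14}$/;

interface WarmTransferNumbers {
  to?: string;
  from?: string;
}

function isE164(value: string): boolean {
  return E164_PATTERN.test(value);
}

// Return the names of the number fields that are present but malformed.
function invalidNumberFields(body: WarmTransferNumbers): string[] {
  const invalid: string[] = [];
  if (body.to !== undefined && !isE164(body.to)) invalid.push('to');
  if (body.from !== undefined && !isE164(body.from)) invalid.push('from');
  return invalid;
}

console.log(invalidNumberFields({ to: '+12015551234', from: '2015551234' }));
```

Both fields are optional in the schema, so absent values are skipped rather than reported.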
## POST /v1/calls/{id}/transfers/{transferId}/complete
Complete warm transfer
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
- `transferId` (string, in path, required)
### Request body
Optional.
Content types: `application/json`
- `recipientParticipantIdentity` (string, optional)
- `summary` (string, optional)
Example body:
```json
{
"recipientParticipantIdentity": "string",
"summary": "string"
}
```
### Responses
#### 200: Transfer
Content type: `application/json`
- `id` (string, required)
- `session_id` (string, required)
- `organization_id` (string, required)
- `kind` (string, required): One of: `"blind"`, `"warm"`.
- `status` (string, required): One of: `"requested"`, `"screening"`, `"bridging"`, `"completed"`, `"failed"`, `"cancelled"`.
- `transfer_to` (string, required)
- `from_room_name` (string | null, optional)
- `consultation_room_name` (string | null, optional)
- `caller_participant_identity` (string | null, optional)
- `recipient_participant_identity` (string | null, optional)
- `outbound_trunk_id` (string | null, optional)
- `screening_prompt` (string | null, optional)
- `summary` (string | null, optional)
- `failure_cause` (string | null, optional)
- `metadata` (object, required)
- `created_at` (string (date-time), required)
- `updated_at` (string (date-time), required)
- `completed_at` (string (date-time) | null, optional)
Example:
```json
{
"id": "string",
"session_id": "string",
"organization_id": "string",
"kind": "blind",
"status": "requested",
"transfer_to": "string",
"from_room_name": "string",
"consultation_room_name": "string",
"caller_participant_identity": "string",
"recipient_participant_identity": "string",
"outbound_trunk_id": "string",
"screening_prompt": "string",
"summary": "string",
"failure_cause": "string",
"metadata": {},
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z",
"completed_at": "2026-01-01T00:00:00Z"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/calls/{id}/transfers/{transferId}/complete' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"recipientParticipantIdentity":"string","summary":"string"}'
```
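Before calling the complete or cancel endpoints, a client may want to skip transfers that have already finished. The sketch below treats `completed`, `failed`, and `cancelled` as terminal states. That reading is an assumption based on the status names, since this reference lists the values without defining their transitions. `isTerminalTransferStatus` and `isOpenWarmTransfer` are hypothetical helpers.

```ts
type TransferKind = 'blind' | 'warm';
type TransferStatus =
  | 'requested'
  | 'screening'
  | 'bridging'
  | 'completed'
  | 'failed'
  | 'cancelled';

// Assumption: these three statuses are final and no further action applies.
const TERMINAL_STATUSES: TransferStatus[] = ['completed', 'failed', 'cancelled'];

function isTerminalTransferStatus(status: TransferStatus): boolean {
  return TERMINAL_STATUSES.indexOf(status) !== -1;
}

// The complete and cancel endpoints are documented for warm transfers,
// so only a warm transfer that is still in flight qualifies here.
function isOpenWarmTransfer(transfer: { kind: TransferKind; status: TransferStatus }): boolean {
  return transfer.kind === 'warm' && !isTerminalTransferStatus(transfer.status);
}

console.log(isOpenWarmTransfer({ kind: 'warm', status: 'screening' }));
```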
## POST /v1/calls/{id}/transfers/{transferId}/cancel
Cancel warm transfer
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
- `transferId` (string, in path, required)
### Request body
Optional.
Content types: `application/json`
- `reason` (string, optional)
- `summary` (string, optional)
- `tryNext` (boolean, optional)
- `voicemailDetected` (boolean, optional)
Example body:
```json
{
"reason": "string",
"summary": "string",
"tryNext": true,
"voicemailDetected": true
}
```
### Responses
#### 200: Transfer response
Content type: `application/json`
- `routing_attempts` (array of object, optional)
  - `id` (string, required)
  - `session_id` (string, required)
  - `organization_id` (string, required)
  - `kind` (string, required): One of: `"blind"`, `"warm"`.
  - `status` (string, required): One of: `"requested"`, `"screening"`, `"bridging"`, `"completed"`, `"failed"`, `"cancelled"`.
  - `transfer_to` (string, required)
  - `from_room_name` (string | null, optional)
  - `consultation_room_name` (string | null, optional)
  - `caller_participant_identity` (string | null, optional)
  - `recipient_participant_identity` (string | null, optional)
  - `outbound_trunk_id` (string | null, optional)
  - `screening_prompt` (string | null, optional)
  - `summary` (string | null, optional)
  - `failure_cause` (string | null, optional)
  - `metadata` (object, required)
  - `created_at` (string (date-time), required)
  - `updated_at` (string (date-time), required)
  - `completed_at` (string (date-time) | null, optional)
- `next_transfer` (object, optional)
  - `id` (string, required)
  - `session_id` (string, required)
  - `organization_id` (string, required)
  - `kind` (string, required): One of: `"blind"`, `"warm"`.
  - `status` (string, required): One of: `"requested"`, `"screening"`, `"bridging"`, `"completed"`, `"failed"`, `"cancelled"`.
  - `transfer_to` (string, required)
  - `from_room_name` (string | null, optional)
  - `consultation_room_name` (string | null, optional)
  - `caller_participant_identity` (string | null, optional)
  - `recipient_participant_identity` (string | null, optional)
  - `outbound_trunk_id` (string | null, optional)
  - `screening_prompt` (string | null, optional)
  - `summary` (string | null, optional)
  - `failure_cause` (string | null, optional)
  - `metadata` (object, required)
  - `created_at` (string (date-time), required)
  - `updated_at` (string (date-time), required)
  - `completed_at` (string (date-time) | null, optional)
- `fallback` (object, optional)
- `id` (string, required)
- `session_id` (string, required)
- `organization_id` (string, required)
- `kind` (string, required): One of: `"blind"`, `"warm"`.
- `status` (string, required): One of: `"requested"`, `"screening"`, `"bridging"`, `"completed"`, `"failed"`, `"cancelled"`.
- `transfer_to` (string, required)
- `from_room_name` (string | null, optional)
- `consultation_room_name` (string | null, optional)
- `caller_participant_identity` (string | null, optional)
- `recipient_participant_identity` (string | null, optional)
- `outbound_trunk_id` (string | null, optional)
- `screening_prompt` (string | null, optional)
- `summary` (string | null, optional)
- `failure_cause` (string | null, optional)
- `metadata` (object, required)
- `created_at` (string (date-time), required)
- `updated_at` (string (date-time), required)
- `completed_at` (string (date-time) | null, optional)
Example:
```json
{
"routing_attempts": [
{
"id": "string",
"session_id": "string",
"organization_id": "string",
"kind": "blind",
"status": "requested",
"transfer_to": "string",
"from_room_name": "string",
"consultation_room_name": "string",
"caller_participant_identity": "string",
"recipient_participant_identity": "string",
"outbound_trunk_id": "string",
"screening_prompt": "string",
"summary": "string",
"failure_cause": "string",
"metadata": {},
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z",
"completed_at": "2026-01-01T00:00:00Z"
}
],
"next_transfer": {
"id": "string",
"session_id": "string",
"organization_id": "string",
"kind": "blind",
"status": "requested",
"transfer_to": "string",
"from_room_name": "string",
"consultation_room_name": "string",
"caller_participant_identity": "string",
"recipient_participant_identity": "string",
"outbound_trunk_id": "string",
"screening_prompt": "string",
"summary": "string",
"failure_cause": "string",
"metadata": {},
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z",
"completed_at": "2026-01-01T00:00:00Z"
},
"fallback": {},
"id": "string",
"session_id": "string",
"organization_id": "string",
"kind": "blind",
"status": "requested",
"transfer_to": "string",
"from_room_name": "string",
"consultation_room_name": "string",
"caller_participant_identity": "string",
"recipient_participant_identity": "string",
"outbound_trunk_id": "string",
"screening_prompt": "string",
"summary": "string",
"failure_cause": "string",
"metadata": {},
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-01T00:00:00Z",
"completed_at": "2026-01-01T00:00:00Z"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/calls/{id}/transfers/{transferId}/cancel' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"reason":"string","summary":"string","tryNext":true,"voicemailDetected":true}'
```
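The same cancel call can be assembled from Python. This is a minimal sketch using only the standard library: it builds the request without sending it. The helper name `build_cancel_request` and the ids `call_123` / `tr_456` are hypothetical placeholders, not values from the API.

```python
import json
import urllib.request

BASE_URL = "https://api.speko.dev"
# The four optional body fields documented for the cancel endpoint.
ALLOWED_FIELDS = {"reason", "summary", "tryNext", "voicemailDetected"}


def build_cancel_request(api_key, call_id, transfer_id, **fields):
    """Build (but do not send) a POST to the cancel-transfer endpoint."""
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unsupported body fields: {sorted(unknown)}")
    # Drop None values so the optional body stays minimal.
    body = {k: v for k, v in fields.items() if v is not None}
    url = f"{BASE_URL}/v1/calls/{call_id}/transfers/{transfer_id}/cancel"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical ids, for illustration only.
req = build_cancel_request(
    "YOUR_SPEKO_API_KEY", "call_123", "tr_456",
    reason="no answer", tryNext=True,
)
```

Pass `req` to `urllib.request.urlopen` to send it; the JSON response should follow the transfer shape listed under the 200 response.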
# Complete (/api-reference/complete)
Single-turn LLM call routed to the best provider for your intent, with automatic failover. Returns assistant text + token usage.
## POST /v1/complete
Generate an LLM completion
Single-turn LLM call routed to the best provider for your intent, with automatic failover before the first event is emitted. The response is a Server-Sent Events stream: `meta`, `delta`, `tool_call`, `server_tool_call`, `done`, and `error`.
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Request body
Required.
Content types: `application/json`
- `messages` (array of object, required)
  - `role` (string, required): One of: `"system"`, `"user"`, `"assistant"`, `"tool"`.
  - `content` (string, required)
  - `toolCalls` (array of object, optional)
    - `id` (string, required)
    - `name` (string, required)
    - `args` (string, required): JSON-encoded tool arguments.
  - `toolCallId` (string, optional): Required on `role: tool`; pairs with a prior assistant tool call id.
  - `isError` (boolean, optional): Marks a tool message as an error result.
- `intent` (object, required)
  - `language` (string, required): BCP-47 language tag, e.g. `en`, `en-US`, `es-MX`. Pattern: `^[a-z]{2}(-[A-Z]{2})?$`.
  - `region` (string, optional): Region whose streaming-latency measurements should be used. `global` falls through to batch-mode rows when no per-region data matches. One of: `"global"`, `"us-east4"`, `"europe-west3"`, `"asia-southeast1"`. Default: `"global"`.
  - `optimizeFor` (string, optional): One of: `"balanced"`, `"accuracy"`, `"latency"`, `"cost"`. Default: `"balanced"`.
- `temperature` (number, optional): Min: 0. Max: 2.
- `maxTokens` (integer, optional): Min: 1.
- `systemPrompt` (string, optional): Forwarded via the canonical system-prompt slot. Don't *also* prepend the same content as a `system` message.
- `constraints` (object, optional)
  - `allowedProviders` (object, optional): Restrict the candidate pool per modality. The router still ranks by score. Each entry is either `"<vendor>"` (vendor wildcard: allow any model from that vendor) or `"<vendor>:<model>"` (allow only that specific model). Failover stays active across all entries in the layer.
    - `stt` (array of string, optional)
      - Allowed STT provider entry. Either a vendor id (e.g. `"deepgram"`, meaning any Deepgram model) or `"<vendor>:<model>"` (e.g. `"deepgram:nova-3"`, meaning only Nova-3, with no fallback to other Deepgram models within the vendor). Both forms can be mixed in the same array.
    - `llm` (array of string, optional)
      - Allowed LLM provider entry. Either a vendor id (e.g. `"openai"`, meaning any OpenAI model) or `"<vendor>:<model>"` (e.g. `"openai:gpt-5"`, meaning only that model). Both forms can be mixed in the same array.
    - `tts` (array of string, optional)
      - Allowed TTS provider entry. Either a vendor id (e.g. `"elevenlabs"`, meaning any ElevenLabs model) or `"<vendor>:<model>"` (e.g. `"elevenlabs:eleven_flash_v2_5"`, meaning only that model). Both forms can be mixed in the same array.
    - `s2s` (array of string, optional)
      - Allowed S2S provider entry. Either a vendor id (e.g. `"openai"`, meaning any OpenAI realtime model) or `"<vendor>:<model>"` (e.g. `"openai:gpt-realtime"`, meaning only that model). Both forms can be mixed in the same array.
- `reasoningEffort` (string, optional): One of: `"none"`, `"minimal"`, `"low"`, `"medium"`, `"high"`, `"xhigh"`.
- `tools` (array of object, optional)
  - `name` (string, required)
  - `description` (string, required)
  - `parameters` (object, required): JSON Schema parameters object.
  - `executionMode` (string, optional): Where the tool runs. Omit for inline execution. One of: `"inline"`, `"webhook"`, `"builtin"`.
  - `source` (object, optional): Webhook, builtin, or inline execution source config.
- `toolChoice` (string | object, optional)
- `parallelToolCalls` (boolean, optional)
- `maxToolHops` (integer, optional): Min: 1. Max: 16.
Example body:
```json
{
"messages": [
{
"role": "system",
"content": "string",
"toolCalls": [
{
"id": "string",
"name": "string",
"args": "string"
}
],
"toolCallId": "string",
"isError": true
}
],
"intent": {
"language": "en-US",
"region": "global",
"optimizeFor": "balanced"
},
"temperature": 0,
"maxTokens": 1,
"systemPrompt": "string",
"constraints": {
"allowedProviders": {
"stt": [
"deepgram"
],
"llm": [
"openai"
],
"tts": [
"elevenlabs"
],
"s2s": [
"openai"
]
}
},
"reasoningEffort": "none",
"tools": [
{
"name": "string",
"description": "string",
"parameters": {},
"executionMode": "inline",
"source": {}
}
],
"toolChoice": "auto",
"parallelToolCalls": true,
"maxToolHops": 1
}
```
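The `allowedProviders` entry format (a bare vendor id, or a vendor id and model joined by a colon) can be read as a simple matching rule. The sketch below shows one way to interpret it on the client side; it is not Speko's router code. The helper names are hypothetical, and `nova-2` is used only as an illustrative second model name.

```python
def parse_entry(entry):
    """Split an allowedProviders entry into (vendor, model_or_None)."""
    vendor, sep, model = entry.partition(":")
    if not vendor or (sep and not model):
        raise ValueError(f"malformed entry: {entry!r}")
    return vendor, (model if sep else None)


def is_allowed(vendor, model, entries):
    """True if (vendor, model) matches any entry in the list."""
    for entry in entries:
        allowed_vendor, allowed_model = parse_entry(entry)
        # A bare vendor id is a wildcard over that vendor's models.
        if allowed_vendor == vendor and allowed_model in (None, model):
            return True
    return False


wildcard = is_allowed("deepgram", "nova-3", ["deepgram"])
pinned_miss = is_allowed("deepgram", "nova-2", ["deepgram:nova-3"])
mixed = is_allowed("openai", "gpt-5", ["deepgram", "openai:gpt-5"])
```

Here `wildcard` and `mixed` are `True`, while `pinned_miss` is `False` because the entry pins a single model.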
### Responses
#### 200: Completion event stream
Content type: `text/event-stream`
SSE events: `meta`, zero or more `delta` and tool events, final `done` containing a `CompleteResponse`, or `error` after streaming has started.
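A client has to split the stream into events before it can act on `delta`, `done`, or `error`. The sketch below parses standard SSE framing (`event:` and `data:` fields, blank line between events) and joins the streamed text. The payload shapes in `sample`, including the `text` field on `delta`, are assumptions made for illustration; the source does not define them.

```python
import json


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)
    if data:
        yield event, "\n".join(data)


def collect(lines):
    """Join delta text until `done`; raise if an `error` event arrives."""
    text = []
    for event, data in parse_sse(lines):
        payload = json.loads(data)
        if event == "delta":
            text.append(payload.get("text", ""))  # ASSUMED field name
        elif event == "error":
            raise RuntimeError(payload)
        elif event == "done":
            return "".join(text), payload
    return "".join(text), None


# Hypothetical stream; real payloads may differ.
sample = [
    "event: meta\n", 'data: {"provider": "example"}\n', "\n",
    "event: delta\n", 'data: {"text": "Hel"}\n', "\n",
    "event: delta\n", 'data: {"text": "lo"}\n', "\n",
    "event: done\n", 'data: {"usage": {}}\n', "\n",
]
text, final = collect(sample)
```

Because failover only happens before the first event, an `error` event mid-stream has to be handled by the caller, for example by retrying the whole request.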
#### 400: Request body or query parameters failed validation.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Invalid request body",
"code": "VALIDATION_ERROR"
}
```
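Several of the documented limits can be checked locally before sending, which avoids a round trip that ends in `VALIDATION_ERROR`. This is a sketch of such a pre-flight check covering only the constraints listed in the request body; the function name is hypothetical and the server may enforce additional rules.

```python
import re

LANGUAGE_RE = re.compile(r"[a-z]{2}(-[A-Z]{2})?")
REGIONS = {"global", "us-east4", "europe-west3", "asia-southeast1"}
OPTIMIZE_FOR = {"balanced", "accuracy", "latency", "cost"}


def validate_complete_body(body):
    """Return a list of problems found against the documented limits."""
    problems = []
    if "messages" not in body:
        problems.append("messages is required")
    intent = body.get("intent")
    if not isinstance(intent, dict):
        problems.append("intent is required")
    else:
        if not LANGUAGE_RE.fullmatch(intent.get("language", "")):
            problems.append("intent.language must look like en or en-US")
        if intent.get("region", "global") not in REGIONS:
            problems.append("intent.region is not a documented region")
        if intent.get("optimizeFor", "balanced") not in OPTIMIZE_FOR:
            problems.append("intent.optimizeFor is not a documented value")
    temperature = body.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature must be between 0 and 2")
    max_tokens = body.get("maxTokens")
    if max_tokens is not None and max_tokens < 1:
        problems.append("maxTokens must be at least 1")
    hops = body.get("maxToolHops")
    if hops is not None and not 1 <= hops <= 16:
        problems.append("maxToolHops must be between 1 and 16")
    return problems


ok_body = {
    "messages": [{"role": "user", "content": "hi"}],
    "intent": {"language": "en-US"},
}
bad_body = {
    "messages": [],
    "intent": {"language": "EN"},
    "temperature": 3,
    "maxTokens": 0,
    "maxToolHops": 17,
}
ok_problems = validate_complete_body(ok_body)
bad_problems = validate_complete_body(bad_body)
```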
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
#### 500: No LLM provider available for the intent, or every candidate failed.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "All providers failed",
"code": "COMPLETE_FAILED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/complete' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"messages":[{"role":"system","content":"string","toolCalls":[{"id":"string","name":"string","args":"string"}],"toolCallId":"string","isError":true}],"intent":{"language":"en-US","region":"global","optimizeFor":"balanced"},"temperature":0,"maxTokens":1,"systemPrompt":"string","constraints":{"allowedProviders":{"stt":["deepgram"],"llm":["openai"],"tts":["elevenlabs"],"s2s":["openai"]}},"reasoningEffort":"none","tools":[{"name":"string","description":"string","parameters":{},"executionMode":"inline","source":{}}],"toolChoice":"auto","parallelToolCalls":true,"maxToolHops":1}'
```
# Introduction (/api-reference/introduction)
Public REST API for the Speko voice gateway.
## Welcome [#welcome]
The Speko API exposes the voice gateway and agent control plane from `https://api.speko.dev`. Core endpoints include `sessions`, `sessions/phone`, `phone-numbers`, `calls`, `callbacks`, `transcribe`, `synthesize`, `complete`, `agents`, `knowledge-bases`, `usage`, `credits`, and provider configuration. Calls that carry a routing intent let Speko pick the highest-scoring provider and fail over server-side. Phone calls support inbound routing, outbound PSTN dialing, lifecycle webhooks, post-call reports, recordings, and blind or warm transfers. The full spec is below.
View the OpenAPI 3.1 spec file
## Servers [#servers]
| Environment | Base URL |
| ----------- | ----------------------- |
| Production | `https://api.speko.dev` |
## Authentication [#authentication]
All endpoints are authenticated with a bearer API key. Mint one at [API keys](https://platform.speko.dev/api-keys).
```http
Authorization: Bearer sk_live_...
```
The OpenAPI spec declares this scheme as `bearerAuth`:
```json
"security": [
{
"bearerAuth": []
}
]
```
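In client code the header is usually built once and reused. A minimal sketch, assuming the key is kept in an environment variable; the variable name `SPEKO_API_KEY` and the `sk_live_` prefix check are assumptions for illustration, not documented requirements.

```python
import os


def auth_headers(api_key=None):
    """Return the bearer header, reading the key from the environment if omitted."""
    key = api_key or os.environ.get("SPEKO_API_KEY", "")  # hypothetical variable name
    # ASSUMPTION: keys start with sk_live_, as in the example header.
    if not key.startswith("sk_live_"):
        raise ValueError("expected an API key starting with sk_live_")
    return {"Authorization": f"Bearer {key}"}


headers = auth_headers("sk_live_example")
```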
# Phone numbers (/api-reference/phone-numbers)
Provision managed numbers, import SIP-trunk numbers, update routing, and manage phone-number business verification.
## GET /v1/phone-numbers
List phone numbers
Returns managed and customer SIP-trunk phone numbers registered to the authenticated organization.
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Responses
#### 200: Organization phone numbers
Content type: `application/json`
- Items (object):
- `id` (string, required)
- `organizationId` (string, required)
- `e164` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
- `source` (string, required): One of: `"managed"`, `"sip_trunk"`.
- `providerResourceId` (string | null, optional)
- `telnyxPhoneNumberId` (string | null, optional): Deprecated.
- `sipTrunkId` (string | null, optional): Deprecated.
- `sipConnectionInstallationId` (string | null, optional)
- `sipProviderName` (string | null, optional)
- `direction` (string, required): One of: `"inbound"`, `"outbound"`, `"both"`.
- `dispatchMetadataTemplate` (object | null, optional)
- `agentId` (string | null, optional)
- `label` (string | null, optional)
- `sms10dlcProfileId` (string | null, optional)
- `smsCampaignId` (string | null, optional)
- `smsAssignmentStatus` (string | null, optional)
- `smsAssignmentUpdatedAt` (string (date-time) | null, optional)
- `setupStatus` (object, required)
  - `status` (string, required): One of: `"ready"`, `"action_required"`, `"suspended"`.
  - `inboundReady` (boolean, required)
  - `outboundReady` (boolean, required)
  - `agentReady` (boolean, required)
  - `forwardingRequired` (boolean, required)
  - `sipConnectionReady` (boolean, required)
  - `issues` (array of string, required)
- `nextChargeAt` (string (date-time), optional)
- `lastChargedAt` (string (date-time) | null, optional)
- `suspendedAt` (string (date-time) | null, optional)
- `createdAt` (string (date-time), required)
- `updatedAt` (string (date-time), required)
Example:
```json
[
{
"id": "string",
"organizationId": "string",
"e164": "+12015551234",
"source": "managed",
"providerResourceId": "string",
"telnyxPhoneNumberId": "string",
"sipTrunkId": "string",
"sipConnectionInstallationId": "string",
"sipProviderName": "string",
"direction": "inbound",
"dispatchMetadataTemplate": {},
"agentId": "string",
"label": "string",
"sms10dlcProfileId": "string",
"smsCampaignId": "string",
"smsAssignmentStatus": "string",
"smsAssignmentUpdatedAt": "2026-01-01T00:00:00Z",
"setupStatus": {
"status": "ready",
"inboundReady": true,
"outboundReady": true,
"agentReady": true,
"forwardingRequired": true,
"sipConnectionReady": true,
"issues": [
"string"
]
},
"nextChargeAt": "2026-01-01T00:00:00Z",
"lastChargedAt": "2026-01-01T00:00:00Z",
"suspendedAt": "2026-01-01T00:00:00Z",
"createdAt": "2026-01-01T00:00:00Z",
"updatedAt": "2026-01-01T00:00:00Z"
}
]
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/phone-numbers' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
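A common follow-up is picking the numbers that can place calls right now. The sketch below filters a parsed list response on `direction` and `setupStatus`. Reading `status == "ready"` together with `outboundReady` as "can dial out" is an interpretation of the field names, and the sample records are made up.

```python
def outbound_ready(numbers):
    """Return the E.164 strings of numbers that look ready for outbound calls."""
    return [
        n["e164"]
        for n in numbers
        if n["direction"] in ("outbound", "both")
        and n["setupStatus"]["status"] == "ready"
        and n["setupStatus"]["outboundReady"]
    ]


# Trimmed, hypothetical records with only the fields the filter reads.
numbers = [
    {"e164": "+12015551234", "direction": "both",
     "setupStatus": {"status": "ready", "outboundReady": True}},
    {"e164": "+12015555678", "direction": "inbound",
     "setupStatus": {"status": "ready", "outboundReady": False}},
    {"e164": "+12015559999", "direction": "outbound",
     "setupStatus": {"status": "action_required", "outboundReady": False}},
]
ready = outbound_ready(numbers)
```

For numbers that are filtered out, `setupStatus.issues` is the place to look for the reason.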
## POST /v1/phone-numbers
Buy a managed phone number
Orders a platform-managed phone number and registers it for inbound and/or outbound calling. Business verification and sufficient credits are required.
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Request body
Required.
Content types: `application/json`
- `e164` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
- `direction` (string, optional): One of: `"inbound"`, `"outbound"`, `"both"`.
- `dispatchMetadataTemplate` (object, optional)
- `label` (string, optional)
- `agentId` (string, optional)
Example body:
```json
{
"e164": "+12015551234",
"direction": "inbound",
"dispatchMetadataTemplate": {},
"label": "string",
"agentId": "string"
}
```
### Responses
#### 200: Phone number
Content type: `application/json`
- `id` (string, required)
- `organizationId` (string, required)
- `e164` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
- `source` (string, required): One of: `"managed"`, `"sip_trunk"`.
- `providerResourceId` (string | null, optional)
- `telnyxPhoneNumberId` (string | null, optional): Deprecated.
- `sipTrunkId` (string | null, optional): Deprecated.
- `sipConnectionInstallationId` (string | null, optional)
- `sipProviderName` (string | null, optional)
- `direction` (string, required): One of: `"inbound"`, `"outbound"`, `"both"`.
- `dispatchMetadataTemplate` (object | null, optional)
- `agentId` (string | null, optional)
- `label` (string | null, optional)
- `sms10dlcProfileId` (string | null, optional)
- `smsCampaignId` (string | null, optional)
- `smsAssignmentStatus` (string | null, optional)
- `smsAssignmentUpdatedAt` (string (date-time) | null, optional)
- `setupStatus` (object, required)
  - `status` (string, required): One of: `"ready"`, `"action_required"`, `"suspended"`.
  - `inboundReady` (boolean, required)
  - `outboundReady` (boolean, required)
  - `agentReady` (boolean, required)
  - `forwardingRequired` (boolean, required)
  - `sipConnectionReady` (boolean, required)
  - `issues` (array of string, required)
- `nextChargeAt` (string (date-time), optional)
- `lastChargedAt` (string (date-time) | null, optional)
- `suspendedAt` (string (date-time) | null, optional)
- `createdAt` (string (date-time), required)
- `updatedAt` (string (date-time), required)
Example:
```json
{
"id": "string",
"organizationId": "string",
"e164": "+12015551234",
"source": "managed",
"providerResourceId": "string",
"telnyxPhoneNumberId": "string",
"sipTrunkId": "string",
"sipConnectionInstallationId": "string",
"sipProviderName": "string",
"direction": "inbound",
"dispatchMetadataTemplate": {},
"agentId": "string",
"label": "string",
"sms10dlcProfileId": "string",
"smsCampaignId": "string",
"smsAssignmentStatus": "string",
"smsAssignmentUpdatedAt": "2026-01-01T00:00:00Z",
"setupStatus": {
"status": "ready",
"inboundReady": true,
"outboundReady": true,
"agentReady": true,
"forwardingRequired": true,
"sipConnectionReady": true,
"issues": [
"string"
]
},
"nextChargeAt": "2026-01-01T00:00:00Z",
"lastChargedAt": "2026-01-01T00:00:00Z",
"suspendedAt": "2026-01-01T00:00:00Z",
"createdAt": "2026-01-01T00:00:00Z",
"updatedAt": "2026-01-01T00:00:00Z"
}
```
#### 400: Request body or query parameters failed validation.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Invalid request body",
"code": "VALIDATION_ERROR"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/phone-numbers' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"e164":"+12015551234","direction":"inbound","dispatchMetadataTemplate":{},"label":"string","agentId":"string"}'
```
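The `e164` pattern is easy to check before calling the endpoint. A small sketch applying the documented regular expression; the function name is hypothetical.

```python
import re

# Documented pattern: "+", a non-zero digit, then 6 to 14 more digits.
E164_RE = re.compile(r"\+[1-9]\d{6,14}", re.ASCII)


def is_e164(value):
    """True if value matches the documented e164 pattern."""
    return E164_RE.fullmatch(value) is not None


checks = {
    "+12015551234": is_e164("+12015551234"),
    "12015551234": is_e164("12015551234"),    # missing "+"
    "+0123456789": is_e164("+0123456789"),    # leading zero
    "+123456": is_e164("+123456"),            # too short
}
```

Only the first entry in `checks` is `True`. The pattern validates shape only; it does not tell you whether the number is available to buy.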
## GET /v1/phone-numbers/available
Search available managed numbers
Searches the platform-managed phone-number pool. Business verification and the buy-phone-numbers feature flag are required.
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Parameters
- `areaCode` (string, in query, optional): Pattern: `^\d{3}$`.
- `locality` (string, in query, optional)
- `limit` (integer, in query, optional): Default: `10`. Min: 1. Max: 50.
### Responses
#### 200: Available numbers
Content type: `application/json`
- Items (object):
- `e164` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
- `friendlyName` (string, required)
- `monthlyCostUsd` (number, required)
- `upfrontCostUsd` (number, required)
- `features` (array of string, required)
- `region` (object, required)
Example:
```json
[
{
"e164": "+12015551234",
"friendlyName": "string",
"monthlyCostUsd": 0,
"upfrontCostUsd": 0,
"features": [
"string"
],
"region": {}
}
]
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/phone-numbers/available' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
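The example request omits the query parameters. The sketch below builds the search URL with `areaCode`, `locality`, and `limit`, enforcing the documented pattern and range on the client; the function name is hypothetical.

```python
import re
from urllib.parse import urlencode

BASE_URL = "https://api.speko.dev/v1/phone-numbers/available"


def available_numbers_url(area_code=None, locality=None, limit=10):
    """Build the search URL, validating areaCode and limit first."""
    if area_code is not None and not re.fullmatch(r"\d{3}", area_code, re.ASCII):
        raise ValueError("areaCode must be exactly three digits")
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    params = {"limit": limit}
    if area_code:
        params["areaCode"] = area_code
    if locality:
        params["locality"] = locality
    return f"{BASE_URL}?{urlencode(params)}"


by_area = available_numbers_url(area_code="201", limit=5)
by_city = available_numbers_url(locality="New York")
```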
## POST /v1/phone-numbers/import
Import a SIP trunk phone number
Registers a customer-owned phone number from an installed SIP connection or a legacy outbound trunk id.
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Request body
Required.
Content types: `application/json`
- `e164` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
- `sipConnectionInstallationId` (string, optional): Installed SIP connection integration id. Preferred for new imports.
- `sipTrunkId` (string, optional): Legacy outbound trunk id. Required when `sipConnectionInstallationId` is omitted.
- `sipProviderName` (string, optional)
- `direction` (string, optional): One of: `"inbound"`, `"outbound"`, `"both"`.
- `dispatchMetadataTemplate` (object, optional)
- `label` (string, optional)
- `agentId` (string, optional)
Example body:
```json
{
"e164": "+12015551234",
"sipConnectionInstallationId": "string",
"sipTrunkId": "string",
"sipProviderName": "string",
"direction": "inbound",
"dispatchMetadataTemplate": {},
"label": "string",
"agentId": "string"
}
```
### Responses
#### 200: Phone number
Content type: `application/json`
- `id` (string, required)
- `organizationId` (string, required)
- `e164` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
- `source` (string, required): One of: `"managed"`, `"sip_trunk"`.
- `providerResourceId` (string | null, optional)
- `telnyxPhoneNumberId` (string | null, optional): Deprecated.
- `sipTrunkId` (string | null, optional): Deprecated.
- `sipConnectionInstallationId` (string | null, optional)
- `sipProviderName` (string | null, optional)
- `direction` (string, required): One of: `"inbound"`, `"outbound"`, `"both"`.
- `dispatchMetadataTemplate` (object | null, optional)
- `agentId` (string | null, optional)
- `label` (string | null, optional)
- `sms10dlcProfileId` (string | null, optional)
- `smsCampaignId` (string | null, optional)
- `smsAssignmentStatus` (string | null, optional)
- `smsAssignmentUpdatedAt` (string (date-time) | null, optional)
- `setupStatus` (object, required)
  - `status` (string, required): One of: `"ready"`, `"action_required"`, `"suspended"`.
  - `inboundReady` (boolean, required)
  - `outboundReady` (boolean, required)
  - `agentReady` (boolean, required)
  - `forwardingRequired` (boolean, required)
  - `sipConnectionReady` (boolean, required)
  - `issues` (array of string, required)
- `nextChargeAt` (string (date-time), optional)
- `lastChargedAt` (string (date-time) | null, optional)
- `suspendedAt` (string (date-time) | null, optional)
- `createdAt` (string (date-time), required)
- `updatedAt` (string (date-time), required)
Example:
```json
{
"id": "string",
"organizationId": "string",
"e164": "+12015551234",
"source": "managed",
"providerResourceId": "string",
"telnyxPhoneNumberId": "string",
"sipTrunkId": "string",
"sipConnectionInstallationId": "string",
"sipProviderName": "string",
"direction": "inbound",
"dispatchMetadataTemplate": {},
"agentId": "string",
"label": "string",
"sms10dlcProfileId": "string",
"smsCampaignId": "string",
"smsAssignmentStatus": "string",
"smsAssignmentUpdatedAt": "2026-01-01T00:00:00Z",
"setupStatus": {
"status": "ready",
"inboundReady": true,
"outboundReady": true,
"agentReady": true,
"forwardingRequired": true,
"sipConnectionReady": true,
"issues": [
"string"
]
},
"nextChargeAt": "2026-01-01T00:00:00Z",
"lastChargedAt": "2026-01-01T00:00:00Z",
"suspendedAt": "2026-01-01T00:00:00Z",
"createdAt": "2026-01-01T00:00:00Z",
"updatedAt": "2026-01-01T00:00:00Z"
}
```
#### 400: Request body or query parameters failed validation.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Invalid request body",
"code": "VALIDATION_ERROR"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/phone-numbers/import' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"e164":"+12015551234","sipConnectionInstallationId":"string","sipTrunkId":"string","sipProviderName":"string","direction":"inbound","dispatchMetadataTemplate":{},"label":"string","agentId":"string"}'
```
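The import body has one cross-field rule: `sipTrunkId` is required when `sipConnectionInstallationId` is omitted. The sketch below encodes that rule and sends only one of the two ids, preferring the installation id. Whether the API accepts both together is not stated, so sending one is a conservative assumption. The function name and the id `trunk_legacy` are hypothetical.

```python
OPTIONAL_FIELDS = {
    "sipProviderName", "direction", "dispatchMetadataTemplate", "label", "agentId",
}
DIRECTIONS = {"inbound", "outbound", "both"}


def build_import_body(e164, sip_connection_installation_id=None,
                      sip_trunk_id=None, **optional):
    """Build the JSON body for importing a SIP-trunk number."""
    if not (sip_connection_installation_id or sip_trunk_id):
        raise ValueError("provide sipConnectionInstallationId or sipTrunkId")
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    direction = optional.get("direction")
    if direction is not None and direction not in DIRECTIONS:
        raise ValueError("direction must be inbound, outbound, or both")
    body = {"e164": e164}
    if sip_connection_installation_id:
        # Preferred for new imports, per the field description.
        body["sipConnectionInstallationId"] = sip_connection_installation_id
    else:
        body["sipTrunkId"] = sip_trunk_id
    body.update(optional)
    return body


body = build_import_body("+12015551234", sip_trunk_id="trunk_legacy", direction="both")
```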
## GET /v1/phone-numbers/kyb
Get phone-number business verification
Returns the latest business verification submission and optional 10DLC-derived prefill.
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Responses
#### 200: Business verification state
Content type: `application/json`
- `status` (string, required): One of: `"missing"`, `"draft"`, `"submitted"`, `"approved"`, `"rejected"`, `"revoked"`.
- `submission` (object, required)
  - `id` (string, required)
  - `organizationId` (string, required)
  - `status` (string, required): One of: `"draft"`, `"submitted"`, `"approved"`, `"rejected"`, `"revoked"`.
  - `businessProfile` (object, optional)
    - `legalName` (string, required)
    - `displayName` (string, required)
    - `entityType` (string, required)
    - `country` (string, required)
    - `registrationId` (string, optional)
    - `website` (string (uri), required)
    - `address` (object, required)
    - `useCase` (string, required)
    - `expectedUsage` (string, required)
  - `authorizedRepresentative` (object, optional)
    - `name` (string, required)
    - `title` (string, required)
    - `email` (string (email), required)
    - `phone` (string, optional): Pattern: `^\+[1-9]\d{6,14}$`.
  - `attestationAccepted` (boolean, required)
  - `attestedAt` (string (date-time) | null, optional)
  - `submittedAt` (string (date-time) | null, optional)
  - `rejectionReason` (string | null, optional)
  - `slackNotificationStatus` (string, optional): One of: `"not_queued"`, `"queued"`, `"enqueue_failed"`.
  - `slackNotificationJobId` (string | null, optional)
  - `slackNotificationError` (string | null, optional)
  - `createdAt` (string (date-time), required)
  - `updatedAt` (string (date-time), required)
- `prefill` (object | null, required)
  - `businessProfile` (object, optional)
    - `legalName` (string, required)
    - `displayName` (string, required)
    - `entityType` (string, required)
    - `country` (string, required)
    - `registrationId` (string, optional)
    - `website` (string (uri), required)
    - `address` (object, required)
    - `useCase` (string, required)
    - `expectedUsage` (string, required)
  - `authorizedRepresentative` (object, optional)
    - `name` (string, required)
    - `title` (string, required)
    - `email` (string (email), required)
    - `phone` (string, optional): Pattern: `^\+[1-9]\d{6,14}$`.
Example:
```json
{
"status": "missing",
"submission": {
"id": "string",
"organizationId": "string",
"status": "draft",
"businessProfile": {
"legalName": "string",
"displayName": "string",
"entityType": "string",
"country": "string",
"registrationId": "string",
"website": "string",
"address": {},
"useCase": "string",
"expectedUsage": "string"
},
"authorizedRepresentative": {
"name": "string",
"title": "string",
"email": "string",
"phone": "+12015551234"
},
"attestationAccepted": true,
"attestedAt": "2026-01-01T00:00:00Z",
"submittedAt": "2026-01-01T00:00:00Z",
"rejectionReason": "string",
"slackNotificationStatus": "not_queued",
"slackNotificationJobId": "string",
"slackNotificationError": "string",
"createdAt": "2026-01-01T00:00:00Z",
"updatedAt": "2026-01-01T00:00:00Z"
},
"prefill": {
"businessProfile": {
"legalName": "string",
"displayName": "string",
"entityType": "string",
"country": "string",
"registrationId": "string",
"website": "string",
"address": {},
"useCase": "string",
"expectedUsage": "string"
},
"authorizedRepresentative": {
"name": "string",
"title": "string",
"email": "string",
"phone": "+12015551234"
}
}
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/phone-numbers/kyb' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
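Since buying and searching managed numbers require business verification, a client may want to check this state first. The sketch below turns the response into a short reason string. Treating `approved` as the only status that satisfies the requirement is an assumption; the source lists the statuses but does not say which ones unlock purchases. The function name is hypothetical.

```python
def purchase_blocker(kyb_state):
    """Return None if verification looks complete, else a short reason."""
    status = kyb_state["status"]
    if status == "approved":  # ASSUMED to be the only unblocking status
        return None
    if status == "rejected":
        submission = kyb_state.get("submission") or {}
        reason = submission.get("rejectionReason") or "no reason given"
        return f"rejected: {reason}"
    return f"verification status is {status}"


approved = purchase_blocker({"status": "approved", "submission": {}, "prefill": None})
rejected = purchase_blocker({
    "status": "rejected",
    "submission": {"rejectionReason": "missing documents"},
    "prefill": None,
})
missing = purchase_blocker({"status": "missing", "submission": {}, "prefill": None})
```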
## PUT /v1/phone-numbers/kyb/draft
Save phone-number business verification draft
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY` (an API key of the form `sk_live_...`).
### Request body
Required.
Content types: `application/json`
- `businessProfile` (object, required)
  - `legalName` (string, required)
  - `displayName` (string, required)
  - `entityType` (string, required)
  - `country` (string, required)
  - `registrationId` (string, optional)
  - `website` (string (uri), required)
  - `address` (object, required)
  - `useCase` (string, required)
  - `expectedUsage` (string, required)
- `authorizedRepresentative` (object, required)
  - `name` (string, required)
  - `title` (string, required)
  - `email` (string (email), required)
  - `phone` (string, optional): Pattern: `^\+[1-9]\d{6,14}$`.
- `attestationAccepted` (boolean, optional): Default: `false`.
Example body:
```json
{
"businessProfile": {
"legalName": "string",
"displayName": "string",
"entityType": "string",
"country": "string",
"registrationId": "string",
"website": "string",
"address": {},
"useCase": "string",
"expectedUsage": "string"
},
"authorizedRepresentative": {
"name": "string",
"title": "string",
"email": "string",
"phone": "+12015551234"
},
"attestationAccepted": false
}
```
### Responses
#### 200: Saved draft
Content type: `application/json`
- `id` (string, required)
- `organizationId` (string, required)
- `status` (string, required): One of: `"draft"`, `"submitted"`, `"approved"`, `"rejected"`, `"revoked"`.
- `businessProfile` (object, optional)
  - `legalName` (string, required)
  - `displayName` (string, required)
  - `entityType` (string, required)
  - `country` (string, required)
  - `registrationId` (string, optional)
  - `website` (string (uri), required)
  - `address` (object, required)
  - `useCase` (string, required)
  - `expectedUsage` (string, required)
- `authorizedRepresentative` (object, optional)
  - `name` (string, required)
  - `title` (string, required)
  - `email` (string (email), required)
  - `phone` (string, optional): Pattern: `^\+[1-9]\d{6,14}$`.
- `attestationAccepted` (boolean, required)
- `attestedAt` (string (date-time) | null, optional)
- `submittedAt` (string (date-time) | null, optional)
- `rejectionReason` (string | null, optional)
- `slackNotificationStatus` (string, optional): One of: `"not_queued"`, `"queued"`, `"enqueue_failed"`.
- `slackNotificationJobId` (string | null, optional)
- `slackNotificationError` (string | null, optional)
- `createdAt` (string (date-time), required)
- `updatedAt` (string (date-time), required)
Example:
```json
{
"id": "string",
"organizationId": "string",
"status": "draft",
"businessProfile": {
"legalName": "string",
"displayName": "string",
"entityType": "string",
"country": "string",
"registrationId": "string",
"website": "string",
"address": {},
"useCase": "string",
"expectedUsage": "string"
},
"authorizedRepresentative": {
"name": "string",
"title": "string",
"email": "string",
"phone": "+12015551234"
},
"attestationAccepted": true,
"attestedAt": "2026-01-01T00:00:00Z",
"submittedAt": "2026-01-01T00:00:00Z",
"rejectionReason": "string",
"slackNotificationStatus": "not_queued",
"slackNotificationJobId": "string",
"slackNotificationError": "string",
"createdAt": "2026-01-01T00:00:00Z",
"updatedAt": "2026-01-01T00:00:00Z"
}
```
#### 400: Request body or query parameters failed validation.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Invalid request body",
"code": "VALIDATION_ERROR"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X PUT 'https://api.speko.dev/v1/phone-numbers/kyb/draft' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"businessProfile":{"legalName":"string","displayName":"string","entityType":"string","country":"string","registrationId":"string","website":"string","address":{},"useCase":"string","expectedUsage":"string"},"authorizedRepresentative":{"name":"string","title":"string","email":"string","phone":"+12015551234"},"attestationAccepted":false}'
```
## POST /v1/phone-numbers/kyb/submit
Submit phone-number business verification
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the value is your API key (`sk_live_...`).
### Request body
Required.
Content types: `application/json`
- `attestationAccepted` (boolean, required): Default: `false`.
- `businessProfile` (object, required)
  - `legalName` (string, required)
  - `displayName` (string, required)
  - `entityType` (string, required)
  - `country` (string, required)
  - `registrationId` (string, optional)
  - `website` (string (uri), required)
  - `address` (object, required)
  - `useCase` (string, required)
  - `expectedUsage` (string, required)
- `authorizedRepresentative` (object, required)
  - `name` (string, required)
  - `title` (string, required)
  - `email` (string (email), required)
  - `phone` (string, optional): Pattern: `^\+[1-9]\d{6,14}$`.
Example body:
```json
{
"attestationAccepted": false,
"businessProfile": {
"legalName": "string",
"displayName": "string",
"entityType": "string",
"country": "string",
"registrationId": "string",
"website": "string",
"address": {},
"useCase": "string",
"expectedUsage": "string"
},
"authorizedRepresentative": {
"name": "string",
"title": "string",
"email": "string",
"phone": "+12015551234"
}
}
```
### Responses
#### 200: Submitted verification
Content type: `application/json`
- `id` (string, required)
- `organizationId` (string, required)
- `status` (string, required): One of: `"draft"`, `"submitted"`, `"approved"`, `"rejected"`, `"revoked"`.
- `businessProfile` (object, optional)
  - `legalName` (string, required)
  - `displayName` (string, required)
  - `entityType` (string, required)
  - `country` (string, required)
  - `registrationId` (string, optional)
  - `website` (string (uri), required)
  - `address` (object, required)
  - `useCase` (string, required)
  - `expectedUsage` (string, required)
- `authorizedRepresentative` (object, optional)
  - `name` (string, required)
  - `title` (string, required)
  - `email` (string (email), required)
  - `phone` (string, optional): Pattern: `^\+[1-9]\d{6,14}$`.
- `attestationAccepted` (boolean, required)
- `attestedAt` (string (date-time) | null, optional)
- `submittedAt` (string (date-time) | null, optional)
- `rejectionReason` (string | null, optional)
- `slackNotificationStatus` (string, optional): One of: `"not_queued"`, `"queued"`, `"enqueue_failed"`.
- `slackNotificationJobId` (string | null, optional)
- `slackNotificationError` (string | null, optional)
- `createdAt` (string (date-time), required)
- `updatedAt` (string (date-time), required)
Example:
```json
{
"id": "string",
"organizationId": "string",
"status": "submitted",
"businessProfile": {
"legalName": "string",
"displayName": "string",
"entityType": "string",
"country": "string",
"registrationId": "string",
"website": "string",
"address": {},
"useCase": "string",
"expectedUsage": "string"
},
"authorizedRepresentative": {
"name": "string",
"title": "string",
"email": "string",
"phone": "+12015551234"
},
"attestationAccepted": true,
"attestedAt": "2026-01-01T00:00:00Z",
"submittedAt": "2026-01-01T00:00:00Z",
"rejectionReason": "string",
"slackNotificationStatus": "not_queued",
"slackNotificationJobId": "string",
"slackNotificationError": "string",
"createdAt": "2026-01-01T00:00:00Z",
"updatedAt": "2026-01-01T00:00:00Z"
}
```
#### 400: Request body or query parameters failed validation.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Invalid request body",
"code": "VALIDATION_ERROR"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/phone-numbers/kyb/submit' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"attestationAccepted":false,"businessProfile":{"legalName":"string","displayName":"string","entityType":"string","country":"string","registrationId":"string","website":"string","address":{},"useCase":"string","expectedUsage":"string"},"authorizedRepresentative":{"name":"string","title":"string","email":"string","phone":"+12015551234"}}'
```
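A client-side pre-check can catch schema violations before the submit request is sent. The sketch below mirrors the required fields and the phone pattern documented above. `KybSubmission` and `validateKybSubmission` are hypothetical names, not part of any Speko SDK, and treating an empty string as missing is an assumption of this sketch; the server remains the source of truth.

```typescript
// Hypothetical pre-check for POST /v1/phone-numbers/kyb/submit.
// The types mirror the documented request schema; the names are illustrative.
interface BusinessProfile {
  legalName: string;
  displayName: string;
  entityType: string;
  country: string;
  registrationId?: string;
  website: string;
  address: { [key: string]: unknown };
  useCase: string;
  expectedUsage: string;
}

interface AuthorizedRepresentative {
  name: string;
  title: string;
  email: string;
  phone?: string;
}

interface KybSubmission {
  attestationAccepted: boolean;
  businessProfile: BusinessProfile;
  authorizedRepresentative: AuthorizedRepresentative;
}

// Same pattern the reference lists for phone fields.
const KYB_PHONE_PATTERN = /^\+[1-9]\d{6,14}$/;

// Required string fields of businessProfile (registrationId is optional).
const REQUIRED_PROFILE_FIELDS = [
  "legalName", "displayName", "entityType", "country",
  "website", "useCase", "expectedUsage",
] as const;

// Returns a list of human-readable problems; an empty list means the
// body passed these local checks (not that the server will accept it).
function validateKybSubmission(body: KybSubmission): string[] {
  const problems: string[] = [];
  for (const key of REQUIRED_PROFILE_FIELDS) {
    // Assumption: an empty string is treated the same as a missing value.
    if (body.businessProfile[key].length === 0) {
      problems.push("businessProfile." + key + " is required");
    }
  }
  const phone = body.authorizedRepresentative.phone;
  if (phone !== undefined && !KYB_PHONE_PATTERN.test(phone)) {
    problems.push("authorizedRepresentative.phone must match the E.164 pattern");
  }
  return problems;
}
```

Running the check before the request keeps obvious `VALIDATION_ERROR` responses out of the submit flow.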
## GET /v1/phone-numbers/{id}
Get a phone number
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the value is your API key (`sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
### Responses
#### 200: Phone number
Content type: `application/json`
- `id` (string, required)
- `organizationId` (string, required)
- `e164` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
- `source` (string, required): One of: `"managed"`, `"sip_trunk"`.
- `providerResourceId` (string | null, optional)
- `telnyxPhoneNumberId` (string | null, optional): Deprecated.
- `sipTrunkId` (string | null, optional): Deprecated.
- `sipConnectionInstallationId` (string | null, optional)
- `sipProviderName` (string | null, optional)
- `direction` (string, required): One of: `"inbound"`, `"outbound"`, `"both"`.
- `dispatchMetadataTemplate` (object | null, optional)
- `agentId` (string | null, optional)
- `label` (string | null, optional)
- `sms10dlcProfileId` (string | null, optional)
- `smsCampaignId` (string | null, optional)
- `smsAssignmentStatus` (string | null, optional)
- `smsAssignmentUpdatedAt` (string (date-time) | null, optional)
- `setupStatus` (object, required)
  - `status` (string, required): One of: `"ready"`, `"action_required"`, `"suspended"`.
  - `inboundReady` (boolean, required)
  - `outboundReady` (boolean, required)
  - `agentReady` (boolean, required)
  - `forwardingRequired` (boolean, required)
  - `sipConnectionReady` (boolean, required)
  - `issues` (array of string, required)
- `nextChargeAt` (string (date-time), optional)
- `lastChargedAt` (string (date-time) | null, optional)
- `suspendedAt` (string (date-time) | null, optional)
- `createdAt` (string (date-time), required)
- `updatedAt` (string (date-time), required)
Example:
```json
{
"id": "string",
"organizationId": "string",
"e164": "+12015551234",
"source": "managed",
"providerResourceId": "string",
"telnyxPhoneNumberId": "string",
"sipTrunkId": "string",
"sipConnectionInstallationId": "string",
"sipProviderName": "string",
"direction": "inbound",
"dispatchMetadataTemplate": {},
"agentId": "string",
"label": "string",
"sms10dlcProfileId": "string",
"smsCampaignId": "string",
"smsAssignmentStatus": "string",
"smsAssignmentUpdatedAt": "2026-01-01T00:00:00Z",
"setupStatus": {
"status": "ready",
"inboundReady": true,
"outboundReady": true,
"agentReady": true,
"forwardingRequired": true,
"sipConnectionReady": true,
"issues": [
"string"
]
},
"nextChargeAt": "2026-01-01T00:00:00Z",
"lastChargedAt": "2026-01-01T00:00:00Z",
"suspendedAt": "2026-01-01T00:00:00Z",
"createdAt": "2026-01-01T00:00:00Z",
"updatedAt": "2026-01-01T00:00:00Z"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/phone-numbers/{id}' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
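The `setupStatus` object is the part of this response most callers act on. As a minimal sketch, the helper below turns it into a single line for logs or a dashboard. `describeSetup` is a hypothetical name, and the reference does not document the wording of the `issues` strings, so the sample value in the usage note is made up.

```typescript
// Shape of `setupStatus` as listed in the 200 response above.
interface SetupStatus {
  status: "ready" | "action_required" | "suspended";
  inboundReady: boolean;
  outboundReady: boolean;
  agentReady: boolean;
  forwardingRequired: boolean;
  sipConnectionReady: boolean;
  issues: string[];
}

// Hypothetical helper (not an SDK function): summarize a number's setup state.
function describeSetup(e164: string, setup: SetupStatus): string {
  if (setup.status === "ready") {
    return e164 + " is ready";
  }
  // Surface whatever the API reported; fall back to a neutral note.
  const detail = setup.issues.length > 0 ? setup.issues.join("; ") : "no issues reported";
  return e164 + " is " + setup.status + ": " + detail;
}
```

For a number with `status: "action_required"` and one issue string, the helper returns the number, the status, and the issue text joined by a colon.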
## PATCH /v1/phone-numbers/{id}
Update a phone number
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the value is your API key (`sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
### Request body
Required.
Content types: `application/json`
- `direction` (string, optional): One of: `"inbound"`, `"outbound"`, `"both"`.
- `dispatchMetadataTemplate` (object | null, optional)
- `label` (string | null, optional)
- `agentId` (string | null, optional)
Example body:
```json
{
"direction": "inbound",
"dispatchMetadataTemplate": {},
"label": "string",
"agentId": "string"
}
```
### Responses
#### 200: Phone number
Content type: `application/json`
- `id` (string, required)
- `organizationId` (string, required)
- `e164` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
- `source` (string, required): One of: `"managed"`, `"sip_trunk"`.
- `providerResourceId` (string | null, optional)
- `telnyxPhoneNumberId` (string | null, optional): Deprecated.
- `sipTrunkId` (string | null, optional): Deprecated.
- `sipConnectionInstallationId` (string | null, optional)
- `sipProviderName` (string | null, optional)
- `direction` (string, required): One of: `"inbound"`, `"outbound"`, `"both"`.
- `dispatchMetadataTemplate` (object | null, optional)
- `agentId` (string | null, optional)
- `label` (string | null, optional)
- `sms10dlcProfileId` (string | null, optional)
- `smsCampaignId` (string | null, optional)
- `smsAssignmentStatus` (string | null, optional)
- `smsAssignmentUpdatedAt` (string (date-time) | null, optional)
- `setupStatus` (object, required)
  - `status` (string, required): One of: `"ready"`, `"action_required"`, `"suspended"`.
  - `inboundReady` (boolean, required)
  - `outboundReady` (boolean, required)
  - `agentReady` (boolean, required)
  - `forwardingRequired` (boolean, required)
  - `sipConnectionReady` (boolean, required)
  - `issues` (array of string, required)
- `nextChargeAt` (string (date-time), optional)
- `lastChargedAt` (string (date-time) | null, optional)
- `suspendedAt` (string (date-time) | null, optional)
- `createdAt` (string (date-time), required)
- `updatedAt` (string (date-time), required)
Example:
```json
{
"id": "string",
"organizationId": "string",
"e164": "+12015551234",
"source": "managed",
"providerResourceId": "string",
"telnyxPhoneNumberId": "string",
"sipTrunkId": "string",
"sipConnectionInstallationId": "string",
"sipProviderName": "string",
"direction": "inbound",
"dispatchMetadataTemplate": {},
"agentId": "string",
"label": "string",
"sms10dlcProfileId": "string",
"smsCampaignId": "string",
"smsAssignmentStatus": "string",
"smsAssignmentUpdatedAt": "2026-01-01T00:00:00Z",
"setupStatus": {
"status": "ready",
"inboundReady": true,
"outboundReady": true,
"agentReady": true,
"forwardingRequired": true,
"sipConnectionReady": true,
"issues": [
"string"
]
},
"nextChargeAt": "2026-01-01T00:00:00Z",
"lastChargedAt": "2026-01-01T00:00:00Z",
"suspendedAt": "2026-01-01T00:00:00Z",
"createdAt": "2026-01-01T00:00:00Z",
"updatedAt": "2026-01-01T00:00:00Z"
}
```
#### 400: Request body or query parameters failed validation.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Invalid request body",
"code": "VALIDATION_ERROR"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X PATCH 'https://api.speko.dev/v1/phone-numbers/{id}' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"direction":"inbound","dispatchMetadataTemplate":{},"label":"string","agentId":"string"}'
```
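Because every field in the PATCH body is optional and three of them accept `null`, it helps to keep "omitted" and "explicitly null" distinct when building the request. The sketch below relies on the fact that `JSON.stringify` drops `undefined` properties but keeps `null` ones. `buildPatchRequest` is a hypothetical helper, and what the server does with a `null` value (for example clearing the field) is not stated in this reference.

```typescript
// Mirrors the documented PATCH body; every field is optional.
interface PhoneNumberPatch {
  direction?: "inbound" | "outbound" | "both";
  dispatchMetadataTemplate?: { [key: string]: unknown } | null;
  label?: string | null;
  agentId?: string | null;
}

interface PatchRequest {
  url: string;
  method: string;
  headers: { [name: string]: string };
  body: string;
}

// Hypothetical helper: returns plain request parts that can be handed to
// any HTTP client (for example `fetch(req.url, req)`).
function buildPatchRequest(id: string, patch: PhoneNumberPatch, apiKey: string): PatchRequest {
  return {
    url: "https://api.speko.dev/v1/phone-numbers/" + encodeURIComponent(id),
    method: "PATCH",
    headers: {
      Authorization: "Bearer " + apiKey,
      "Content-Type": "application/json",
    },
    // undefined fields are omitted from the JSON; null fields are sent as null.
    body: JSON.stringify(patch),
  };
}
```

Passing `{ label: null }` sends `{"label":null}`, while leaving `label` out of the object sends no `label` key at all.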
## DELETE /v1/phone-numbers/{id}
Release a phone number
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the value is your API key (`sk_live_...`).
### Parameters
- `id` (string, in path, required): Resource id.
### Responses
#### 200: Release result
Content type: `application/json`
- `released` (boolean, required)
Example:
```json
{
"released": true
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X DELETE 'https://api.speko.dev/v1/phone-numbers/{id}' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
# Create phone session (/api-reference/sessions-phone)
POST /v1/sessions/phone — place an outbound PSTN call backed by a Speko voice session.
## POST /v1/sessions/phone
Place an outbound phone call
Creates a voice session, dispatches the configured agent worker, and dials the destination over LiveKit SIP. Supply either `agentId` or `intent`.
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the value is your API key (`sk_live_...`).
### Request body
Required.
Content types: `application/json`
- `to` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
- `from` (string, optional): Pattern: `^\+[1-9]\d{6,14}$`.
- `agentId` (string, optional): Persisted agent to run. Required unless `intent` is supplied.
- `intent` (object, optional)
  - `language` (string, required): BCP-47 language tag, e.g. `en`, `en-US`, `es-MX`. Pattern: `^[a-z]{2}(-[A-Z]{2})?$`.
  - `region` (string, optional): Region whose streaming-latency measurements should be used. `global` falls through to batch-mode rows when no per-region data matches. One of: `"global"`, `"us-east4"`, `"europe-west3"`, `"asia-southeast1"`. Default: `"global"`.
  - `optimizeFor` (string, optional): One of: `"balanced"`, `"accuracy"`, `"latency"`, `"cost"`. Default: `"balanced"`.
- `constraints` (object, optional)
  - `allowedProviders` (object, optional): Restrict candidate pool per modality. Router still ranks by score. Each entry is either `"{vendor}"` (vendor wildcard: allow any model from that vendor) or `"{vendor}:{model}"` (allow only that specific model). Failover stays active across all entries in the layer.
    - `stt` (array of string, optional)
      - Each item is an allowed STT provider entry: either a vendor id (e.g. `"deepgram"`, any Deepgram model) or `"{vendor}:{model}"` (e.g. `"deepgram:nova-3"`, only Nova-3 with no fallback to other Deepgram models within the vendor). Both forms can be mixed in the same array.
    - `llm` (array of string, optional)
      - Each item is an allowed LLM provider entry: either a vendor id (e.g. `"openai"`, any OpenAI model) or `"{vendor}:{model}"` (e.g. `"openai:gpt-5"`, only that model). Both forms can be mixed in the same array.
    - `tts` (array of string, optional)
      - Each item is an allowed TTS provider entry: either a vendor id (e.g. `"elevenlabs"`, any ElevenLabs model) or `"{vendor}:{model}"` (e.g. `"elevenlabs:eleven_flash_v2_5"`, only that model). Both forms can be mixed in the same array.
    - `s2s` (array of string, optional)
      - Each item is an allowed S2S provider entry: either a vendor id (e.g. `"openai"`, any OpenAI realtime model) or `"{vendor}:{model}"` (e.g. `"openai:gpt-realtime"`, only that model). Both forms can be mixed in the same array.
- `voice` (string, optional)
- `systemPrompt` (string, optional)
- `firstMessage` (string, optional)
- `llm` (object, optional)
- `ttsOptions` (object, optional)
- `sttOptions` (object, optional)
- `telephony` (object, optional)
  - `region` (string, optional): Optional SIP routing region hint forwarded as `X-Speko-Region`.
  - `amd` (object, optional)
    - `mode` (string, optional): One of: `"agent"`, `"carrier"`, `"disabled"`. Default: `"agent"`.
    - `timeoutSeconds` (integer, optional): Min: 1. Max: 60.
- `metadata` (object, optional)
Example body:
```json
{
"to": "+12015551234",
"from": "+12015551234",
"agentId": "string",
"intent": {
"language": "en-US",
"region": "global",
"optimizeFor": "balanced"
},
"constraints": {
"allowedProviders": {
"stt": [
"deepgram"
],
"llm": [
"openai"
],
"tts": [
"elevenlabs"
],
"s2s": [
"openai"
]
}
},
"voice": "string",
"systemPrompt": "string",
"firstMessage": "string",
"llm": {},
"ttsOptions": {},
"sttOptions": {},
"telephony": {
"region": "string",
"amd": {
"mode": "agent",
"timeoutSeconds": 30
}
},
"metadata": {}
}
```
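The two entry forms accepted by `allowedProviders` (a bare vendor id such as `"deepgram"`, or a vendor and model joined by a colon such as `"deepgram:nova-3"`) can be mirrored client-side, for example to preview which candidates a constraint would admit. The sketch below is an illustration of the matching rule as described above, not the router's actual implementation; `parseProviderEntry` and `isAllowed` are hypothetical names.

```typescript
interface ProviderEntry {
  vendor: string;
  model: string | null; // null means "any model from this vendor"
}

// Split an allow-list entry at the first colon.
function parseProviderEntry(entry: string): ProviderEntry {
  const sep = entry.indexOf(":");
  if (sep === -1) {
    return { vendor: entry, model: null };
  }
  return { vendor: entry.slice(0, sep), model: entry.slice(sep + 1) };
}

// True when the (vendor, model) candidate is admitted by at least one entry.
function isAllowed(allowList: string[], vendor: string, model: string): boolean {
  for (const raw of allowList) {
    const entry = parseProviderEntry(raw);
    if (entry.vendor !== vendor) continue;
    if (entry.model === null || entry.model === model) return true;
  }
  return false;
}
```

With `["deepgram:nova-3"]` only Nova-3 passes; with `["deepgram"]` any Deepgram model passes; mixing both forms in one array admits the union.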
### Responses
#### 200: Call dialing
Content type: `application/json`
- `sessionId` (string, required)
- `callControlId` (string, required): LiveKit SIP participant identity for the outbound leg.
- `roomName` (string, required)
- `status` (string, required): One of: `"dialing"`, `"dialing-stub"`.
- `to` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
- `from` (string, required): Pattern: `^\+[1-9]\d{6,14}$`.
Example:
```json
{
"sessionId": "string",
"callControlId": "string",
"roomName": "string",
"status": "dialing",
"to": "+12015551234",
"from": "+12015551234"
}
```
#### 400: Request body or query parameters failed validation.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Invalid request body",
"code": "VALIDATION_ERROR"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
#### 404: No agent with this id exists in the authenticated organization.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "not found",
"code": "NOT_FOUND"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/sessions/phone' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"to":"+12015551234","from":"+12015551234","agentId":"string","intent":{"language":"en-US","region":"global","optimizeFor":"balanced"},"constraints":{"allowedProviders":{"stt":["deepgram"],"llm":["openai"],"tts":["elevenlabs"],"s2s":["openai"]}},"voice":"string","systemPrompt":"string","firstMessage":"string","llm":{},"ttsOptions":{},"sttOptions":{},"telephony":{"region":"string","amd":{"mode":"agent","timeoutSeconds":30}},"metadata":{}}'
```
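Several constraints in this request are easy to get wrong: the E.164 pattern on `to` and `from`, the need for `agentId` or `intent`, the language pattern, and the 1 to 60 range on `telephony.amd.timeoutSeconds`. A minimal local check, assuming the schema above is complete, might look like the following. `checkPhoneSessionRequest` is a hypothetical helper and only covers a subset of the body.

```typescript
interface PhoneRoutingIntent {
  language: string;
  region?: "global" | "us-east4" | "europe-west3" | "asia-southeast1";
  optimizeFor?: "balanced" | "accuracy" | "latency" | "cost";
}

// Subset of the documented request body that this sketch validates.
interface PhoneSessionRequest {
  to: string;
  from?: string;
  agentId?: string;
  intent?: PhoneRoutingIntent;
  telephony?: {
    region?: string;
    amd?: { mode?: "agent" | "carrier" | "disabled"; timeoutSeconds?: number };
  };
}

const PHONE_E164 = /^\+[1-9]\d{6,14}$/;
const PHONE_LANGUAGE = /^[a-z]{2}(-[A-Z]{2})?$/;

function checkPhoneSessionRequest(req: PhoneSessionRequest): string[] {
  const problems: string[] = [];
  if (!PHONE_E164.test(req.to)) problems.push("to must match the E.164 pattern");
  if (req.from !== undefined && !PHONE_E164.test(req.from)) {
    problems.push("from must match the E.164 pattern");
  }
  // The endpoint description asks for agentId or intent.
  if (req.agentId === undefined && req.intent === undefined) {
    problems.push("supply agentId or intent");
  }
  if (req.intent !== undefined && !PHONE_LANGUAGE.test(req.intent.language)) {
    problems.push("intent.language must look like en or en-US");
  }
  const amd = req.telephony ? req.telephony.amd : undefined;
  const timeout = amd ? amd.timeoutSeconds : undefined;
  if (timeout !== undefined && (Math.floor(timeout) !== timeout || timeout < 1 || timeout > 60)) {
    problems.push("telephony.amd.timeoutSeconds must be an integer from 1 to 60");
  }
  return problems;
}
```

An empty result only means the local checks passed; the API can still reject the call, for example with `NOT_FOUND` when the agent id does not exist.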
# Sessions (/api-reference/sessions)
Mints browser-safe media transport credentials, persists the pipeline config, and dispatches an agent worker. Use the returned `transportToken` and `transportUrl` with `@spekoai/client` to join from a browser.
## POST /v1/sessions
Create a real-time voice session
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the value is your API key (`sk_live_...`).
### Request body
Required.
Content types: `application/json`
- `agentId` (string, optional): Optional pointer to a persisted `agent` row whose fields seed this session. When supplied, the agent's `systemPrompt`, `voice`, `intent`, and `llmOptions` become defaults; any per-call field on this body overrides the agent's stored value. When absent, `intent` is required.
- `intent` (object, optional)
  - `language` (string, required): BCP-47 language tag, e.g. `en`, `en-US`, `es-MX`. Pattern: `^[a-z]{2}(-[A-Z]{2})?$`.
  - `region` (string, optional): Region whose streaming-latency measurements should be used. `global` falls through to batch-mode rows when no per-region data matches. One of: `"global"`, `"us-east4"`, `"europe-west3"`, `"asia-southeast1"`. Default: `"global"`.
  - `optimizeFor` (string, optional): One of: `"balanced"`, `"accuracy"`, `"latency"`, `"cost"`. Default: `"balanced"`.
- `constraints` (object, optional)
  - `allowedProviders` (object, optional): Restrict candidate pool per modality. Router still ranks by score. Each entry is either `"{vendor}"` (vendor wildcard: allow any model from that vendor) or `"{vendor}:{model}"` (allow only that specific model). Failover stays active across all entries in the layer.
    - `stt` (array of string, optional)
      - Each item is an allowed STT provider entry: either a vendor id (e.g. `"deepgram"`, any Deepgram model) or `"{vendor}:{model}"` (e.g. `"deepgram:nova-3"`, only Nova-3 with no fallback to other Deepgram models within the vendor). Both forms can be mixed in the same array.
    - `llm` (array of string, optional)
      - Each item is an allowed LLM provider entry: either a vendor id (e.g. `"openai"`, any OpenAI model) or `"{vendor}:{model}"` (e.g. `"openai:gpt-5"`, only that model). Both forms can be mixed in the same array.
    - `tts` (array of string, optional)
      - Each item is an allowed TTS provider entry: either a vendor id (e.g. `"elevenlabs"`, any ElevenLabs model) or `"{vendor}:{model}"` (e.g. `"elevenlabs:eleven_flash_v2_5"`, only that model). Both forms can be mixed in the same array.
    - `s2s` (array of string, optional)
      - Each item is an allowed S2S provider entry: either a vendor id (e.g. `"openai"`, any OpenAI realtime model) or `"{vendor}:{model}"` (e.g. `"openai:gpt-realtime"`, only that model). Both forms can be mixed in the same array.
- `voice` (string, optional): Force a specific TTS voice id.
- `systemPrompt` (string, optional): Initial agent instructions.
- `llm` (object, optional)
  - `temperature` (number, optional): Min: 0. Max: 2.
  - `maxTokens` (integer, optional): Min: 1.
- `ttsOptions` (object, optional)
  - `sampleRate` (integer, optional): Min: 1.
  - `speed` (number, optional): Min: 0.5. Max: 2.
- `metadata` (object, optional): Arbitrary key/value pairs stored on the session row.
- `ttlSeconds` (integer, optional): Token TTL. Default: `900`. Min: 1. Max: 86400.
- `identity` (string, optional): Transport participant identity. Defaults to `user_`.
Example body:
```json
{
"agentId": "string",
"intent": {
"language": "en-US",
"region": "global",
"optimizeFor": "balanced"
},
"constraints": {
"allowedProviders": {
"stt": [
"deepgram"
],
"llm": [
"openai"
],
"tts": [
"elevenlabs"
],
"s2s": [
"openai"
]
}
},
"voice": "string",
"systemPrompt": "string",
"llm": {
"temperature": 0,
"maxTokens": 256
},
"ttsOptions": {
"sampleRate": 24000,
"speed": 1
},
"metadata": {},
"ttlSeconds": 900,
"identity": "string"
}
```
### Responses
#### 201: Session created
Content type: `application/json`
- `sessionId` (string (uuid), required)
- `transportToken` (string, required): Browser-safe media transport token.
- `transportUrl` (string (uri), required): Media transport URL.
- `conversationToken` (string, required): Compatibility alias for `transportToken`. Deprecated.
- `livekitUrl` (string (uri), required): Compatibility alias for `transportUrl`. Deprecated.
- `roomName` (string, required)
- `identity` (string, required)
- `expiresAt` (string (date-time), required)
Example:
```json
{
"sessionId": "string",
"transportToken": "string",
"transportUrl": "string",
"conversationToken": "string",
"livekitUrl": "string",
"roomName": "speko_...",
"identity": "string",
"expiresAt": "2026-01-01T00:00:00Z"
}
```
#### 400: Request body or query parameters failed validation.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Invalid request body",
"code": "VALIDATION_ERROR"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
#### 500: Token mint failed, agent dispatch failed, or DB insert failed.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Failed to dispatch voice agent",
"code": "SESSION_CREATE_FAILED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/sessions' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"agentId":"string","intent":{"language":"en-US","region":"global","optimizeFor":"balanced"},"constraints":{"allowedProviders":{"stt":["deepgram"],"llm":["openai"],"tts":["elevenlabs"],"s2s":["openai"]}},"voice":"string","systemPrompt":"string","llm":{"temperature":0,"maxTokens":256},"ttsOptions":{"sampleRate":24000,"speed":1},"metadata":{},"ttlSeconds":900,"identity":"string"}'
```
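The transport token is short-lived (`ttlSeconds` defaults to 900), so a backend that hands credentials to browsers usually wants to know how much lifetime is left. The sketch below derives that from the `expiresAt` field of the 201 response. `secondsUntilExpiry` and `shouldRefresh` are hypothetical helpers, and the 60-second margin is an arbitrary illustrative value, not a Speko recommendation.

```typescript
// Subset of the documented 201 response used by this sketch.
interface SessionCredentials {
  transportToken: string;
  transportUrl: string;
  expiresAt: string; // ISO 8601 date-time, as returned by the API
}

// Whole seconds remaining before the token expires (negative once expired).
function secondsUntilExpiry(session: SessionCredentials, nowMs: number): number {
  return Math.floor((Date.parse(session.expiresAt) - nowMs) / 1000);
}

// Decide whether to mint a fresh session instead of reusing this one.
function shouldRefresh(session: SessionCredentials, nowMs: number, marginSeconds: number = 60): boolean {
  return secondsUntilExpiry(session, nowMs) <= marginSeconds;
}
```

Passing the clock in as `nowMs` (for example `Date.now()`) keeps the helpers deterministic and easy to test.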
# Synthesize (/api-reference/synthesize)
Routes the request to the best TTS provider for your intent, with automatic failover. Returns binary audio. The `Content-Type` header (mirrored in `X-Speko-Audio-Format`) tells you the format.
## POST /v1/synthesize
Synthesize speech (text to speech)
Routes the request to the best TTS provider for your intent, with automatic failover before the first audio chunk. Returns chunked binary audio. The `Content-Type` header (mirrored in `X-Speko-Audio-Format`) tells you the format.
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the value is your API key (`sk_live_...`).
### Request body
Required.
Content types: `application/json`
- `text` (string, required)
- `intent` (object, required)
  - `language` (string, required): BCP-47 language tag, e.g. `en`, `en-US`, `es-MX`. Pattern: `^[a-z]{2}(-[A-Z]{2})?$`.
  - `region` (string, optional): Region whose streaming-latency measurements should be used. `global` falls through to batch-mode rows when no per-region data matches. One of: `"global"`, `"us-east4"`, `"europe-west3"`, `"asia-southeast1"`. Default: `"global"`.
  - `optimizeFor` (string, optional): One of: `"balanced"`, `"accuracy"`, `"latency"`, `"cost"`. Default: `"balanced"`.
- `voice` (string, optional): Provider-specific voice id. Speko falls back to a sane default per provider when omitted.
- `model` (string, optional): Optional upstream model name (e.g. `eleven_multilingual_v2`, `sonic-2`, `gpt-4o-mini-tts`, `qwen3-tts-flash`). When set, overrides the selector's chosen model on the primary candidate only — failover candidates still use the selector's model so a model intended for provider A is not sent to provider B.
- `speed` (number, optional): Min: 0.5. Max: 2.
- `constraints` (object, optional)
  - `allowedProviders` (object, optional): Restrict candidate pool per modality. Router still ranks by score. Each entry is either `"{vendor}"` (vendor wildcard: allow any model from that vendor) or `"{vendor}:{model}"` (allow only that specific model). Failover stays active across all entries in the layer.
    - `stt` (array of string, optional)
      - Each item is an allowed STT provider entry: either a vendor id (e.g. `"deepgram"`, any Deepgram model) or `"{vendor}:{model}"` (e.g. `"deepgram:nova-3"`, only Nova-3 with no fallback to other Deepgram models within the vendor). Both forms can be mixed in the same array.
    - `llm` (array of string, optional)
      - Each item is an allowed LLM provider entry: either a vendor id (e.g. `"openai"`, any OpenAI model) or `"{vendor}:{model}"` (e.g. `"openai:gpt-5"`, only that model). Both forms can be mixed in the same array.
    - `tts` (array of string, optional)
      - Each item is an allowed TTS provider entry: either a vendor id (e.g. `"elevenlabs"`, any ElevenLabs model) or `"{vendor}:{model}"` (e.g. `"elevenlabs:eleven_flash_v2_5"`, only that model). Both forms can be mixed in the same array.
    - `s2s` (array of string, optional)
      - Each item is an allowed S2S provider entry: either a vendor id (e.g. `"openai"`, any OpenAI realtime model) or `"{vendor}:{model}"` (e.g. `"openai:gpt-realtime"`, only that model). Both forms can be mixed in the same array.
Example body:
```json
{
"text": "string",
"intent": {
"language": "en-US",
"region": "global",
"optimizeFor": "balanced"
},
"voice": "string",
"model": "string",
"speed": 1,
"constraints": {
"allowedProviders": {
"stt": [
"deepgram"
],
"llm": [
"openai"
],
"tts": [
"elevenlabs"
],
"s2s": [
"openai"
]
}
}
}
```
### Responses
#### 200: Chunked audio bytes
Content type: `application/octet-stream`
#### 400: Request body or query parameters failed validation.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Invalid request body",
"code": "VALIDATION_ERROR"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
#### 500: No TTS provider available for the intent, no default voice for chosen provider, or every candidate failed.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "All providers failed",
"code": "SYNTHESIZE_FAILED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/synthesize' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'Content-Type: application/json' \
-d '{"text":"string","intent":{"language":"en-US","region":"global","optimizeFor":"balanced"},"voice":"string","model":"string","speed":1,"constraints":{"allowedProviders":{"stt":["deepgram"],"llm":["openai"],"tts":["elevenlabs"],"s2s":["openai"]}}}'
```
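Since the audio format is reported in response headers rather than in a JSON body, a caller has to read `X-Speko-Audio-Format` (or `Content-Type`) before decoding the bytes. The sketch below shows one way to do that. `parseAudioFormat` and `responseAudioFormat` are hypothetical helpers, and the assumption that a PCM response carries a `rate` parameter (as in `audio/pcm;rate=16000`) is borrowed from the request-side MIME examples elsewhere in this reference, not confirmed for synthesize responses.

```typescript
interface AudioFormat {
  mimeType: string;          // e.g. "audio/mpeg"
  sampleRate: number | null; // from a `rate=` parameter, when present
}

// Parse a MIME value such as "audio/pcm;rate=16000".
function parseAudioFormat(value: string): AudioFormat {
  const parts = value.split(";");
  const mimeType = parts[0].trim().toLowerCase();
  let sampleRate: number | null = null;
  for (let i = 1; i < parts.length; i++) {
    const match = /^\s*rate\s*=\s*(\d+)\s*$/i.exec(parts[i]);
    if (match) sampleRate = parseInt(match[1], 10);
  }
  return { mimeType, sampleRate };
}

// Prefer X-Speko-Audio-Format, then fall back to Content-Type.
// Header names are matched case-insensitively.
function responseAudioFormat(headers: { [name: string]: string }): AudioFormat | null {
  const lower: { [name: string]: string } = {};
  for (const name of Object.keys(headers)) {
    lower[name.toLowerCase()] = headers[name];
  }
  const value = lower["x-speko-audio-format"] || lower["content-type"];
  return value ? parseAudioFormat(value) : null;
}
```

With a real HTTP client, the same two header lookups apply to the response object before the chunked body is read.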
# Transcribe (/api-reference/transcribe)
Routes the request to the best STT provider for your `(language, region, optimizeFor)` intent, with automatic failover to runner-up providers. Body is binary audio. Routing intent goes in the `x-speko-intent` header (JSON).
## POST /v1/transcribe
Transcribe audio (speech to text)
Routes the request to the best STT provider for your `(language, region, optimizeFor)` intent, with automatic failover to runner-up providers before the first event is emitted. Body is binary audio. Routing intent goes in the `x-speko-intent` header (JSON). The response is a Server-Sent Events stream: `meta`, `transcript`, `done`, and `error`.
Auth: required. Send `Authorization: Bearer YOUR_SPEKO_API_KEY`, where the value is your API key (`sk_live_...`).
### Parameters
- `x-speko-intent` (string, in header, required): JSON-encoded `RoutingIntent`. Example: `{"language":"en-US","region":"global"}`.
- `x-speko-constraints` (string, in header, optional): Optional JSON-encoded `PipelineConstraints`. Each `allowedProviders` entry is either `"{vendor}"` (any model from that vendor) or `"{vendor}:{model}"` (a specific model). Example: `{"allowedProviders":{"stt":["deepgram:nova-3"]}}`.
- `x-speko-stt-options` (string, in header, optional): Optional JSON-encoded STT options. Example: `{"keywords":["Speko","Ava Martinez"]}`.
- `Content-Type` (string, in header, optional): Audio MIME type, e.g. `audio/wav`, `audio/mpeg`, `audio/pcm;rate=16000`.
### Request body
Required. Binary audio. WAV, MP3, or raw PCM accepted.
Content types: `application/octet-stream`, `audio/wav`, `audio/mpeg`
### Responses
#### 200: Transcript event stream
Content type: `text/event-stream`
SSE events: `meta` with provider/model, zero or more `transcript` events, final `done` containing a `TranscribeResponse`, or `error` after streaming has started.
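Because the response arrives as Server-Sent Events, a client has to split the byte stream into events and keep any trailing partial event for the next chunk. The sketch below is a minimal parser for the standard `event:` and `data:` fields. `parseSse` is a hypothetical helper, and the payload field names used in the usage note (`provider`, `text`) are assumptions, since this page does not list the JSON shape of each event.

```typescript
interface SseEvent {
  event: string; // "meta", "transcript", "done", or "error" for this endpoint
  data: string;  // raw data payload; JSON.parse it as needed
}

// Parse all complete events in `buffer`; return the unparsed tail in `rest`
// so the caller can prepend it to the next chunk.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const events: SseEvent[] = [];
  // Events are separated by a blank line. Normalize CRLF first.
  const blocks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = blocks.pop() || ""; // last block may be incomplete
  for (const block of blocks) {
    let event = "message"; // SSE default when no event field is present
    const data: string[] = [];
    for (const line of block.split("\n")) {
      if (line.indexOf("event:") === 0) {
        event = line.slice(6).trim();
      } else if (line.indexOf("data:") === 0) {
        // Per the SSE format, one leading space after the colon is dropped.
        data.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (data.length > 0) {
      events.push({ event, data: data.join("\n") });
    }
  }
  return { events, rest };
}
```

A streaming loop would append each decoded chunk to `rest`, call `parseSse`, handle `transcript` events as they arrive, and stop on `done` or `error`.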
#### 400: Empty body, missing/invalid `x-speko-intent` header, or malformed `x-speko-constraints`.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Audio body is empty.",
"code": "INVALID_AUDIO"
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
#### 500: No STT provider available for the intent, or every candidate failed.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "All providers failed",
"code": "TRANSCRIBE_FAILED"
}
```
### Example request
```bash
curl -X POST 'https://api.speko.dev/v1/transcribe' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY' \
-H 'x-speko-intent: {"language":"en-US","region":"global"}' \
-H 'Content-Type: audio/wav' \
--data-binary '@audio.wav'
```
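Because the 200 response is a Server-Sent Events stream, a client has to split it into events before it can read the final transcript. The sketch below parses already-received SSE text in Python. The event names come from the reference above; the payload fields in the sample stream are made up for illustration, and `parse_sse` is a hypothetical helper rather than part of any Speko SDK.

```python
import json

def parse_sse(stream_text):
    """Split raw SSE text into (event_name, decoded_json) pairs."""
    events = []
    name, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                events.append((name, json.loads("\n".join(data_lines))))
            name, data_lines = "message", []
        elif line.startswith("event:"):
            name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        # Comment lines (":...") and other fields are ignored in this sketch.
    return events

# Made-up sample stream; only the event names come from the reference.
SAMPLE = (
    'event: meta\ndata: {"provider": "deepgram", "model": "nova-3"}\n\n'
    'event: transcript\ndata: {"text": "hello"}\n\n'
    'event: done\ndata: {"text": "hello world", "failoverCount": 0}\n\n'
)

events = parse_sse(SAMPLE)
final = next(data for name, data in events if name == "done")
```

A production client would read the stream incrementally and treat an `error` event as a failed request even though the HTTP status was 200.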
# Voices (/api-reference/voices)
Read-only catalog of TTS voices grouped by provider. ElevenLabs voices are account-scoped and fetched live from ElevenLabs rather than returned here.
## GET /v1/voices
List TTS voices
Returns the curated catalog of TTS voices grouped by provider, plus the list of TTS providers Speko routes to. ElevenLabs is included in `providers` with `voicesFetchedLive: true` because its voice library is account-scoped — fetch it directly from `https://api.elevenlabs.io/v1/voices` with the org's key.
Auth: required. Send `Authorization: Bearer <token>`, where the token is your API key (`sk_live_...`).
### Parameters
- `provider` (string, in query, optional): Filter to a single provider's voices. Accepts either the routing key (`cartesia`, `xai`, `alibaba`, `openai`, `inworld`, `elevenlabs`) or the catalog suffix form (`xai-tts`, `alibaba-tts`, `openai-tts`).
### Responses
#### 200: Voice catalog
Content type: `application/json`
- `voices` (array of object, required)
- `vendor` (string, required): Routing-key vendor (matches `allowedProviders.tts` entries).
- `id` (string, required): Voice id passed through to the provider's TTS API.
- `name` (string, required): Human-readable label.
- `providers` (array of object, required)
- `key` (string, required)
- `name` (string, required)
- `models` (array of string, required)
- `voicesFetchedLive` (boolean, required): `true` when the provider's voice library is account-scoped and must be fetched live from the provider (currently ElevenLabs).
Example:
```json
{
"voices": [
{
"vendor": "string",
"id": "string",
"name": "string"
}
],
"providers": [
{
"key": "string",
"name": "string",
"models": [
"string"
],
"voicesFetchedLive": true
}
]
}
```
#### 401: Missing or invalid bearer token.
Content type: `application/json`
- `error` (string, required): Human-readable message.
- `code` (string, required): Machine-readable code.
Example:
```json
{
"error": "Unauthorized",
"code": "UNAUTHORIZED"
}
```
### Example request
```bash
curl -X GET 'https://api.speko.dev/v1/voices' \
-H 'Authorization: Bearer YOUR_SPEKO_API_KEY'
```
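Once the catalog is fetched, it usually needs regrouping before it is useful in a voice picker. The Python sketch below works on an already-decoded response body. The field names follow the 200 schema above; the sample values and both helper names are hypothetical.

```python
from collections import defaultdict

def voices_by_vendor(catalog):
    """Group voice ids by their routing-key vendor."""
    grouped = defaultdict(list)
    for voice in catalog["voices"]:
        grouped[voice["vendor"]].append(voice["id"])
    return dict(grouped)

def live_fetch_providers(catalog):
    """Providers whose voices must be fetched from the provider directly."""
    return [p["key"] for p in catalog["providers"] if p["voicesFetchedLive"]]

# Illustrative payload; ids, names, and models are placeholders.
catalog = {
    "voices": [
        {"vendor": "cartesia", "id": "voice-a", "name": "Voice A"},
        {"vendor": "cartesia", "id": "voice-b", "name": "Voice B"},
        {"vendor": "openai", "id": "voice-c", "name": "Voice C"},
    ],
    "providers": [
        {"key": "cartesia", "name": "Cartesia", "models": ["model-x"], "voicesFetchedLive": False},
        {"key": "elevenlabs", "name": "ElevenLabs", "models": ["model-y"], "voicesFetchedLive": True},
    ],
}

grouped = voices_by_vendor(catalog)
needs_live_fetch = live_fetch_providers(catalog)
```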
# Agents (/concepts/agents)
Reusable voice personas — system prompt, voice, and intent defaults that any session can be started against.
An **Agent** is a persisted persona record on Speko. It bundles the values you'd otherwise have to repeat on every session create:
* `systemPrompt` — the agent's instructions
* `voice` — TTS voice id (provider-specific)
* `intent` — `language` and optional `optimizeFor` (`latency` / `quality` / `cost`)
* `llmOptions` — optional `temperature`, `maxTokens`, `model`
* `stackPreferences` — optional per-layer allowlists (which STT / LLM / TTS providers the router is allowed to pick)
Once created, an agent is referenced by id (e.g. `agent_a1b2c3d4e5f60718`) when starting a voice session. Per-call body fields still override the agent's stored defaults — agents are sensible defaults, not a wall.
## When to use an agent [#when-to-use-an-agent]
Use one when the same persona is starting more than one session — that is, almost always in production. Without an agent, every `POST /v1/sessions` ships the full config inline; with an agent, you ship just `{ agentId }`.
```diff
- POST /v1/sessions
- { "intent": { "language": "en" }, "voice": "...", "systemPrompt": "...", ... }
+ POST /v1/sessions
+ { "agentId": "agent_a1b2c3d4e5f60718" }
```
Per-call overrides are still allowed and still win, so you can pin the persona without giving up per-call tweaks (a different voice for a single VIP user, a longer system prompt for a high-stakes call, etc.).
## Stack preferences [#stack-preferences]
Speko's default behavior is auto-routing — for each session, the highest-scoring provider per pipeline layer (STT, LLM, TTS) is picked from every benchmarked option. **Stack preferences** narrow that pool per agent, so you can express opinions like "this agent's STT must be Deepgram or AssemblyAI" or "LLM must be OpenAI or Anthropic" without giving up auto-routing or failover within the allowed set.
```json
{
"stackPreferences": {
"allowedProviders": {
"stt": ["deepgram", "assemblyai"],
"llm": ["openai", "anthropic"],
"tts": ["cartesia", "elevenlabs"]
}
}
}
```
### Vendor vs model entries [#vendor-vs-model-entries]
Each entry in an `allowedProviders` list is either a vendor id (allow any model from that vendor) or `"<vendor>:<model>"` (allow only that specific model). Both forms can be mixed in the same array. Failover stays active across all entries in the layer.
```json
{
"stackPreferences": {
"allowedProviders": {
"stt": ["deepgram:nova-3", "assemblyai"],
"llm": ["openai:gpt-5", "anthropic"],
"tts": ["elevenlabs:eleven_flash_v2_5"]
}
}
}
```
Reading that example: STT must be Deepgram's Nova-3 specifically, or any model from AssemblyAI. LLM must be OpenAI's GPT-5 specifically, or any model from Anthropic. TTS must be ElevenLabs' `eleven_flash_v2_5` and nothing else.
`["deepgram:nova-3", "deepgram:nova-2"]` is also valid — that pins to two specific Deepgram models, with no fallback to other Deepgram models within the vendor (failover to other allowed vendors still applies). Use `GET /v1/providers/known` to enumerate every `(vendor, model)` pair the router knows about — the `id` field on each entry is the verbatim string to put here.
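The vendor-versus-model matching described above can be expressed in a few lines. This Python sketch is an illustration of the documented semantics, not the router's actual code; `is_allowed` is a hypothetical name.

```python
def is_allowed(allowlist, vendor, model):
    """Return True when (vendor, model) passes a single layer's allowlist."""
    if not allowlist:
        # An empty or missing layer means no constraint.
        return True
    # A bare vendor id admits every model; "vendor:model" admits only that model.
    return vendor in allowlist or f"{vendor}:{model}" in allowlist

stt_allowlist = ["deepgram:nova-3", "assemblyai"]
checks = {
    "nova-3": is_allowed(stt_allowlist, "deepgram", "nova-3"),
    "nova-2": is_allowed(stt_allowlist, "deepgram", "nova-2"),
    "assemblyai": is_allowed(stt_allowlist, "assemblyai", "best"),
}
```

With this allowlist, Deepgram's Nova-2 is rejected because only the Nova-3 entry names Deepgram, while any AssemblyAI model passes.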
Rules of thumb:
* **Empty / missing layer = no constraint.** A `stt` allowlist of `[]` (or absent entirely) means the router has full freedom for STT.
* **Failover stays active within the allowed set.** If you allowlist Deepgram + AssemblyAI for STT, and Deepgram errors mid-call, the router will fail over to AssemblyAI rather than to a non-allowed provider. The same holds across model-level entries — `["deepgram:nova-3", "assemblyai"]` will fail over from Nova-3 to AssemblyAI.
* **Per-call overrides win per layer.** When you start a session with both an `agentId` *and* an inline `constraints.allowedProviders.stt`, the per-call `stt` list replaces the agent's `stt` list — but the agent's `llm` and `tts` lists still apply unless those are also overridden.
* **An allowlist with no rankable provider fails to route.** If you allowlist a provider that has no benchmark data, the router can't rank it; the session will error rather than silently picking an alternative.
Stack preferences are configured per agent — either via the dashboard's **Stack preferences** section on the Persona tab, or via `stackPreferences` in `POST /v1/agents` / `PATCH /v1/agents/{id}`.
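The per-layer override rule can be sketched as a dictionary merge. The function below is a hypothetical Python illustration of that rule only. How an explicitly empty per-call list is treated is not specified here, so the sketch simply copies whatever list the call supplies.

```python
def effective_allowlists(agent_allowed, call_allowed):
    """Merge agent-level allowlists with per-call overrides, layer by layer."""
    merged = dict(agent_allowed or {})
    for layer, entries in (call_allowed or {}).items():
        # A per-call list replaces the agent's list for that layer only.
        merged[layer] = list(entries)
    return merged

agent_allowed = {
    "stt": ["deepgram", "assemblyai"],
    "llm": ["openai", "anthropic"],
    "tts": ["cartesia", "elevenlabs"],
}
call_allowed = {"stt": ["deepgram:nova-3"]}

effective = effective_allowlists(agent_allowed, call_allowed)
```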
## Session history [#session-history]
Every session started with `agentId` in `POST /v1/sessions` is linked to that agent at create time. The dashboard surfaces them under the **Sessions** tab on the agent's detail page; the API equivalent is `GET /v1/sessions?agent=<agentId>`, which scopes the standard sessions list down to that agent's calls and combines with the existing `cursor`, `status`, `kind`, and `limit` filters.
The link is read-only — sessions are not retroactively attached to an agent, only at create time.
## Tools live under agents [#tools-live-under-agents]
Every tool you register is scoped to a specific agent — the unique key is `(organization, agentId, toolName)`. This means:
* Different agents in the same org can register tools with the same name (`lookup_user`, `book_appointment`) without colliding.
* Adding a tool to one agent doesn't expose it to your other agents.
* The voice worker fetches `GET /v1/agents/{agentId}/tools` at session start to assemble the tool list.
Manage tools from the **Tools** tab inside an agent's detail page in the dashboard, or via the `/v1/agents/{id}/tools` API.
## Lifecycle [#lifecycle]
| Operation | Endpoint | Notes |
| --------- | ------------------------ | ------------------------------------------------------------------------------------------ |
| Create | `POST /v1/agents` | Returns the new id. Names are unique per organization. |
| List | `GET /v1/agents` | Returns every agent in the calling key's organization. |
| Read | `GET /v1/agents/{id}` | 404 if not found / not in your org. |
| Update | `PATCH /v1/agents/{id}` | Partial — send only the fields you want to change. |
| Delete | `DELETE /v1/agents/{id}` | Also removes the agent's tools. The organization's only remaining agent cannot be deleted. |
## The `Default` agent [#the-default-agent]
When your organization is created, Speko seeds a single agent named `Default` so you have something to point at on day one. It's a regular agent in every respect — rename it, edit its system prompt, voice, and intent, or delete it after you create another agent. Speko only prevents deleting the last remaining agent in an organization.
## Where to next [#where-to-next]
Browser-side WebRTC using a short-lived session token minted by your server.
Manage agents and their tools from the web dashboard.
Full request/response shapes for the `/v1/agents` endpoints.
# Bring your own keys (/concepts/byok)
Use your own provider credentials. Speko routes; providers bill you directly.
By default, calls run against Speko-managed provider credentials and roll up to a single Speko bill. With BYOK (Bring Your Own Keys) you supply your own API keys per provider — Speko still picks the best provider per call, but the provider charges *you*.
## Why BYOK [#why-byok]
* **Existing volume discount.** You already have negotiated rates with Deepgram, OpenAI, ElevenLabs, etc.
* **Compliance.** You hold the contract / DPA / BAA with the provider directly.
* **Spend visibility.** Provider invoices land in your existing billing.
## Configure [#configure]
Open [Provider keys](https://platform.speko.dev/settings/provider-keys). Paste each provider's API key. Speko stores secrets encrypted at rest and uses them only for your organization's traffic.
Programmatic equivalent: `PUT /v1/providers` with `{ provider, apiKey }`.
```bash
curl -X PUT https://api.speko.dev/v1/providers \
-H "Authorization: Bearer $SPEKO_API_KEY" \
-H "Content-Type: application/json" \
-d '{ "provider": "deepgram", "apiKey": "..." }'
```
Remove a key with `DELETE /v1/providers/:provider`. The provider falls back to platform-managed credentials.
## How the router resolves credentials [#how-the-router-resolves-credentials]
For each call, the router picks the highest-ranked provider for the intent (subject to constraints), then asks the secrets store: "do we have an org-scoped key for this provider?"
* **Yes** → use the BYOK key. Failover candidates that lack BYOK keys fall through to platform-managed credentials if those exist.
* **No** → use platform-managed credentials. If none are configured, that candidate is skipped.
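The resolution order above can be modeled as a lookup against two key stores. This is a Python sketch under stated assumptions: both stores are plain dictionaries, the `"platform"` label is a stand-in chosen for the example, and `resolve_credentials` is a hypothetical name.

```python
def resolve_credentials(candidates, byok_keys, platform_keys):
    """Return (provider, source, key) for each usable candidate, in rank order."""
    resolved = []
    for provider in candidates:
        if provider in byok_keys:
            resolved.append((provider, "BYOK", byok_keys[provider]))
        elif provider in platform_keys:
            resolved.append((provider, "platform", platform_keys[provider]))
        # Neither store has a key, so this candidate is skipped.
    return resolved

resolved = resolve_credentials(
    candidates=["deepgram", "assemblyai", "unconfigured-vendor"],
    byok_keys={"deepgram": "org-deepgram-key"},
    platform_keys={"deepgram": "platform-deepgram-key", "assemblyai": "platform-aai-key"},
)
```

The org-scoped Deepgram key wins over the platform key, AssemblyAI falls through to platform credentials, and the third candidate drops out of the chain.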
## Inspect status [#inspect-status]
`GET /v1/providers` returns each known provider plus a `configured` flag and a `source` of `null` (platform-managed) or `"BYOK"`. The same status is shown in [Provider keys](https://platform.speko.dev/settings/provider-keys).
## Supported providers [#supported-providers]
* **STT**: Deepgram, AssemblyAI
* **LLM**: OpenAI
* **TTS**: ElevenLabs, Cartesia
The list grows. `GET /v1/providers/known` is the source of truth.
# Failover (/concepts/failover)
How Speko transparently retries against runner-up providers when the primary fails.
Every routing decision returns a primary `SelectedCandidate` plus an ordered `runnersUp` list: the next-best providers for the same intent. If the primary throws, Speko retries against the next candidate. From the caller's perspective, it is one request and one response. In the response, `failoverCount` tells you how many providers it tried before one succeeded.
## What counts as a failure [#what-counts-as-a-failure]
A candidate is considered failed if it:
* Throws a network or 5xx error.
* Returns no transcript / no audio / no completion text.
* Errors mid-stream.
Auth errors against BYOK keys also count as failures: Speko advances to the next candidate without retrying the current one.
## What does not retry [#what-does-not-retry]
* **4xx caller errors** (bad audio, malformed request, missing intent header) — returned to the caller immediately. Failover doesn't help when the input is wrong.
* **Auth errors against the Speko gateway itself** — your API key is invalid; that's not a routing problem.
## All candidates exhausted [#all-candidates-exhausted]
If every candidate in the failover chain fails, Speko returns `ALL_PROVIDERS_FAILED` with a list of the underlying errors. This is rare in practice and usually signals a wide-scale outage or an over-restrictive `constraints.allowedProviders` list.
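The failover behavior described in this section can be approximated with a simple loop. This is a conceptual Python sketch, not Speko's server code: `run_with_failover` and `AllProvidersFailed` are hypothetical names, and the sketch does not model the non-retried 4xx caller errors.

```python
class AllProvidersFailed(Exception):
    """Raised when every candidate in the chain has failed."""
    def __init__(self, errors):
        super().__init__("ALL_PROVIDERS_FAILED")
        self.errors = errors

def run_with_failover(candidates, call):
    """Try each candidate in rank order and report how many failed first."""
    errors = []
    for failover_count, candidate in enumerate(candidates):
        try:
            result = call(candidate)
        except Exception as exc:
            errors.append((candidate, str(exc)))
            continue
        if not result:
            # An empty transcript, audio, or completion counts as a failure.
            errors.append((candidate, "empty result"))
            continue
        return {"provider": candidate, "result": result, "failoverCount": failover_count}
    raise AllProvidersFailed(errors)

def fake_call(provider):
    # Stand-in for a provider call; the primary always fails here.
    if provider == "deepgram":
        raise ConnectionError("upstream 503")
    return f"transcript from {provider}"

outcome = run_with_failover(["deepgram", "assemblyai"], fake_call)
```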
## Observability [#observability]
* Response body: `failoverCount`, `provider`, `model`, `scoresRunId`.
* Response headers: `X-Speko-Failover-Count`, `X-Speko-Provider`, `X-Speko-Model`, `X-Speko-Scores-Run-Id`.
* Server logs: `[transcribe] failover deepgram/nova-2 → assemblyai/best: <error>`.
If you see consistent non-zero `failoverCount` for a specific intent, check the dashboard provider grid — a BYOK key might be misconfigured, or a provider might be degraded for your region.
# How routing works (/concepts/routing)
Intent + benchmark scores → ranked candidates → live failover. The model behind every Speko call.
Speko continuously benchmarks every supported STT, LLM, TTS, and S2S provider across language and region. Every API call carries a `RoutingIntent`. The router applies hard filters, normalizes the surviving candidates against each other, picks the top-ranked provider, and falls back through runners-up if the primary fails.
## Intent [#intent]
```ts
type RoutingIntent = {
language: string; // BCP-47, e.g. "en-US", "es-MX"
region?: string; // 'global' | 'us-east4' | 'europe-west3' | 'asia-southeast1' (default: 'global')
optimizeFor?: 'balanced' | 'accuracy' | 'latency' | 'cost'; // default: balanced
};
```
`language` is required. `region` selects which streaming-latency measurements the router uses; it defaults to `'global'`. If a provider only published a `global` row (typical for batch endpoints), Speko falls back to that row when no per-region data matches your intent. The TypeScript SDK exposes `region` on `RoutingIntent`; raw HTTP callers send the same value in `X-Speko-Intent`.
`optimizeFor` chooses a weight preset that biases the per-modality composite. Defaults are tuned for production-leaning balance:
| Modality | Quality axis | Latency axis | Cost axis |
| -------- | --------------------------------- | ----------------------- | ---------------------------------------------------------- |
| **STT** | WER (lower is better) | TTFP p50 by region | $/min |
| **TTS** | Round-trip CER | TTFB p50 by region | $/min (chars-billed providers converted via 900 chars/min) |
| **LLM** | Quality score | TTFT p50 | Blended $/1M tokens |
| **S2S** | Task-success % (higher is better) | Tool-call p50 by region | $/min |
`balanced` weights for STT/TTS are `0.5 quality / 0.3 latency / 0.2 cost`. S2S is `0.4 / 0.4 / 0.2` (success and turn-latency carry equal weight). LLM is `0.5 / 0.3 / 0.2`. The other presets shift weight toward their named axis.
## Selection [#selection]
For each modality the selector:
1. Filters to providers with measurements for `(language, region)`. If no region-specific row exists, falls back to `region='global'` rows.
2. Applies hard filters from the active routing policy: e.g. STT drops anything above `max_ttfp_p50_ms = 3000`; all modalities exclude providers with `status='warned'`.
3. Min-max-inverts each axis over the surviving candidate set, so scores are relative to who's still in the running, not to a fixed scale.
4. Computes the weighted composite, sorts, and returns the top candidate plus an ordered `runnersUp` list.
Providers shipping with `status='provisional'` (scaffolded but not measured) and `status='warned'` (measured but flagged unsafe to route to) are visible in the admin UI but excluded from selector output.
Each call returns `scoresRunId` — the benchmark snapshot the decision was based on. Useful for audit and bug repro.
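The four selection steps can be traced on a toy STT candidate set. The Python sketch below assumes the usual min-max inversion `(max - x) / (max - min)` and gives every candidate 1.0 on an axis where all values tie; both are assumptions, as are the candidate ids and measurements. The cutoff and the `balanced` weights are the ones quoted above.

```python
BALANCED_STT = {"quality": 0.5, "latency": 0.3, "cost": 0.2}
MAX_TTFP_P50_MS = 3000

def invert_min_max(values):
    """Scale lower-is-better values to [0, 1], where 1 is the best candidate."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]  # assumption: ties get full marks
    return [(hi - v) / (hi - lo) for v in values]

def rank_stt(candidates, weights=BALANCED_STT):
    # Hard filters run first, so they never affect normalization.
    pool = [c for c in candidates
            if c["status"] == "production" and c["ttfp_p50_ms"] <= MAX_TTFP_P50_MS]
    if not pool:
        return []
    quality = invert_min_max([c["wer"] for c in pool])
    latency = invert_min_max([c["ttfp_p50_ms"] for c in pool])
    cost = invert_min_max([c["usd_per_min"] for c in pool])
    scored = [
        (c["id"], weights["quality"] * q + weights["latency"] * l + weights["cost"] * k)
        for c, q, l, k in zip(pool, quality, latency, cost)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Made-up measurements for illustration only.
candidates = [
    {"id": "vendor-a", "wer": 0.05, "ttfp_p50_ms": 300, "usd_per_min": 0.007, "status": "production"},
    {"id": "vendor-b", "wer": 0.08, "ttfp_p50_ms": 200, "usd_per_min": 0.006, "status": "production"},
    {"id": "vendor-c", "wer": 0.04, "ttfp_p50_ms": 3500, "usd_per_min": 0.004, "status": "production"},
    {"id": "vendor-d", "wer": 0.03, "ttfp_p50_ms": 250, "usd_per_min": 0.020, "status": "warned"},
    {"id": "vendor-e", "wer": 0.06, "ttfp_p50_ms": 250, "usd_per_min": 0.010, "status": "production"},
]

ranking = rank_stt(candidates)
primary, runners_up = ranking[0], ranking[1:]
```

In this toy set `vendor-c` has the second-best WER but is dropped by the latency cutoff, and `vendor-d` is dropped for its `warned` status, so neither influences the scores of the three survivors.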
## Failover [#failover]
The runners-up are the next-best providers for your exact intent. If the primary throws, Speko transparently retries the same request against the next candidate. The response includes `failoverCount` (how many providers it tried before one succeeded) and `provider` / `model` (what actually ran).
If every candidate fails, the call returns `ALL_PROVIDERS_FAILED`.
## Constraints [#constraints]
Pin or restrict the candidate pool per modality:
```json
{
"constraints": {
"allowedProviders": {
"stt": ["deepgram"],
"tts": ["cartesia"],
"s2s": ["openai"]
}
}
}
```
Speko still ranks by composite — it just picks the highest-ranking candidate that's in your allow-list. Use this to:
* Pin a provider while debugging.
* Honor compliance constraints (data residency, BAA coverage).
* Cap costs by excluding premium providers.
Allowlists are model-aware. Each entry is either a vendor id (`"deepgram"`: any Deepgram model) or `"<vendor>:<model>"` (`"deepgram:nova-3"`: only Nova-3, no fallback to other Deepgram models within the vendor). Failover stays active across all entries in the layer. Enumerate the valid `id` strings via `GET /v1/providers/known`.
## Preview before you ship [#preview-before-you-ship]
`GET /v1/benchmarks/stack?language=en&region=us-east4&optimize_for=balanced` returns the current pick per modality plus runners-up, `scoresRunId`, and a `filtered_out[]` list explaining why each excluded candidate was dropped (warned status, missed latency cutoff, missing region data, etc.). No usage is recorded.
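Building the preview URL with a query-string encoder avoids hand-escaping mistakes in the `language`, `region`, and `optimize_for` parameters. A small Python sketch follows. The helper name is hypothetical, and its default values mirror the `RoutingIntent` defaults as an assumption rather than documented endpoint behavior.

```python
from urllib.parse import urlencode

def stack_preview_url(language, region="global", optimize_for="balanced",
                      base="https://api.speko.dev"):
    """Build the GET /v1/benchmarks/stack URL for a routing intent."""
    query = urlencode({
        "language": language,
        "region": region,
        "optimize_for": optimize_for,
    })
    return f"{base}/v1/benchmarks/stack?{query}"

url = stack_preview_url("en", region="us-east4")
```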
## Headers on every response [#headers-on-every-response]
Every `/v1/transcribe`, `/v1/synthesize`, `/v1/complete` response carries:
* `X-Speko-Provider` — provider that handled the request
* `X-Speko-Model` — specific model
* `X-Speko-Failover-Count` — how many providers we tried
* `X-Speko-Scores-Run-Id` — benchmark snapshot id
Log these. They're how you correlate prod behavior with the routing decision.
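One way to log these headers is to fold them into a structured record right after each call. The Python sketch below takes any mapping of response headers. The output field names reuse the response-body names (`provider`, `model`, `failoverCount`, `scoresRunId`), and `routing_metadata` is a hypothetical helper.

```python
ROUTING_HEADERS = {
    "x-speko-provider": "provider",
    "x-speko-model": "model",
    "x-speko-failover-count": "failoverCount",
    "x-speko-scores-run-id": "scoresRunId",
}

def routing_metadata(headers):
    """Extract Speko routing headers into a log-friendly dict."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    record = {field: lowered.get(name) for name, field in ROUTING_HEADERS.items()}
    if record["failoverCount"] is not None:
        record["failoverCount"] = int(record["failoverCount"])
    return record

# Illustrative header values.
record = routing_metadata({
    "Content-Type": "audio/mpeg",
    "X-Speko-Provider": "elevenlabs",
    "X-Speko-Model": "eleven_flash_v2_5",
    "X-Speko-Failover-Count": "1",
    "X-Speko-Scores-Run-Id": "run-123",
})
```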
# Benchmarks and scoring (/concepts/scoring)
How provider scores are computed and refreshed.
Provider rankings come from a continuously-running benchmark suite. Each provider/model is scored per `(language, region)` on a per-modality basis:
| Modality | Quality axis | Latency axis | Cost axis |
| -------- | ------------------------------------- | ----------------------- | ---------------------------------------------------------- |
| **STT** | Word Error Rate (WER) | TTFP p50 by region | $/min (tier-priced) |
| **TTS** | Round-trip Character Error Rate (CER) | TTFB p50 by region | $/min (chars-billed providers converted via 900 chars/min) |
| **LLM** | Quality score | TTFT p50 | Blended $/1M tokens |
| **S2S** | Task-success % | Tool-call p50 by region | $/min |
Composites are computed at query time using **min-max-inverted normalization** over the active candidate set after hard filters. Two consequences worth understanding:
* A provider's score is *relative to the rest of the candidates for your intent*, not a fixed ranking. Adding or removing a candidate (via `constraints.allowedProviders`, status changes, or new ingest) can shift everyone's normalized score.
* Lower-is-better axes (WER, CER, latency, cost) are inverted so higher composite is always better. S2S `task_success_pct` is already higher-is-better and is carried through unchanged.
See [routing](/concepts/routing) for weights per `optimizeFor` preset.
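The TTS cost axis in the table above converts character-billed pricing to dollars per minute at 900 characters per minute. A worked Python example of that arithmetic follows. The per-1,000-characters price unit and the sample price are assumptions made for illustration.

```python
CHARS_PER_MINUTE = 900  # conversion rate quoted in the table above

def usd_per_minute(usd_per_1k_chars):
    """Convert a per-1,000-characters price to an equivalent per-minute price."""
    return usd_per_1k_chars * CHARS_PER_MINUTE / 1000

# Hypothetical provider charging $0.03 per 1,000 characters.
per_minute = usd_per_minute(0.03)
```

At that hypothetical price the provider is compared against minute-billed providers at roughly $0.027 per minute.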
## Provider status [#provider-status]
Every benchmark row carries a `status`:
* **`production`** — measured, in good standing. Eligible for routing.
* **`warned`** — measured, but flagged for a known issue (transcription drift, output instability, etc.). Visible in admin tooling, excluded from routing.
* **`provisional`** — scaffolded but not yet measured. Visible in admin tooling, excluded from routing.
Hard filters drop `warned` and `provisional` candidates before scoring; they never appear in your `runnersUp` list.
## Refresh cadence [#refresh-cadence]
Benchmarks rerun on a schedule and on every benchmark suite update. The active snapshot is identified by `scoresRunId`, returned with every routing decision. Two calls with identical intent within the same snapshot route the same way; across snapshots, a re-ranking can move a different provider into the top spot.
## Health gating [#health-gating]
Hard filters per modality include latency cutoffs (e.g. STT drops candidates above `max_ttfp_p50_ms = 3000`) and the `status != 'warned'` rule above. These run before normalization, so they don't pull other candidates' relative scores around.
## Routing policy [#routing-policy]
Weights and hard filters live in code as defaults (`DEFAULT_STT_POLICY`, `DEFAULT_TTS_POLICY`, `DEFAULT_S2S_POLICY`, `DEFAULT_LLM_POLICY`) and can be overridden per `request_type` via the `routing_policy` table. There can be at most one active policy per modality at a time. Policy changes take effect on the next selector refresh — no re-ingest required.
## Why benchmarks beat a single eval [#why-benchmarks-beat-a-single-eval]
Production traffic is heterogeneous: Spanish healthcare dictation has different leaders than English casual chat, and a provider that wins in `us-east4` can lose badly in `asia-southeast1`. A static "best STT" decision under-serves anything outside the benchmarked happy path. Speko's routing layer means you get the leader *per call*, not per integration choice.
## Inspecting scores [#inspecting-scores]
`GET /v1/benchmarks/stack?language=en&region=us-east4&optimize_for=accuracy` returns the same ranking the router would use, plus `runnersUp[].score` and `filtered_out[]` (with reasons). Use it to debug "why did Speko pick X for this intent?".
## Custom benchmarks [#custom-benchmarks]
Not in v1. Reach out if your traffic doesn't fit our published benchmark suite.
# Sessions vs one-shot (/concepts/sessions)
When to mint a session and when to call a one-shot endpoint directly.
Speko supports two integration shapes. Pick based on whether your call is real-time interactive or single-turn batch.
## One-shot endpoints [#one-shot-endpoints]
`POST /v1/transcribe`, `POST /v1/synthesize`, `POST /v1/complete`. Each is a single round-trip:
* Caller sends input + intent.
* Speko picks a provider, runs the call (with failover), returns the result.
Use for:
* Batch transcription of recorded audio.
* Server-side TTS for notifications, IVR prompts, exports.
* LLM completions in a non-voice flow.
No state is held between calls. There is nothing to clean up.
## Voice sessions [#voice-sessions]
`POST /v1/sessions`. Returns a `transportToken` plus a `transportUrl` and room name. The browser uses [`@spekoai/client`](/client/overview) to join the media transport; Speko dispatches an agent worker into the same session to run the STT → LLM → TTS pipeline in real time over WebRTC.
Use for:
* Live voice conversations between an end-user and an agent.
* Anything that needs barge-in, partial transcripts, or sub-second latency.
The session has a TTL (`ttlSeconds`, default 900s, max 86400s). The agent worker leaves when the room empties or the token expires. The `voiceSession` row is retained for usage and audit.
## Phone sessions [#phone-sessions]
`POST /v1/sessions/phone` creates the same kind of voice session, then dials a PSTN destination over LiveKit SIP. Inbound calls follow the reverse path: a registered phone number receives the carrier webhook, Speko creates the voice session, hydrates the linked agent or dispatch metadata template, and bridges the caller into the room.
Use phone sessions for:
* Outbound appointment reminders, sales calls, and scheduled callbacks.
* Inbound receptionists and support lines.
* Calls that need carrier lifecycle events, forwarded-number metadata, post-call reports, recordings, or live transfers.
See [Build a phone agent](/guides/phone-agents) for the end-to-end phone flow.
## Choosing [#choosing]
| Need | Use |
| ------------------------------------------------- | ------------------------------------------------------- |
| Transcribe a file, return text | `/v1/transcribe` |
| Generate audio for a notification | `/v1/synthesize` |
| Single LLM reply, no voice | `/v1/complete` |
| Real-time voice agent in a browser | `/v1/sessions` + `@spekoai/client` |
| Outbound PSTN voice call | `/v1/sessions/phone` or `speko.voice.dial()` |
| Inbound PSTN receptionist | `/v1/phone-numbers` linked to an agent |
| Inspect reports, events, recordings, or transfers | `/v1/calls/{id}` |
| Real-time voice in a self-hosted framework worker | `@spekoai/adapter-livekit` directly, when using LiveKit |
## What sessions don't do [#what-sessions-dont-do]
* They aren't a chat history store. The dispatched agent's `systemPrompt` and per-turn context live in your worker.
* They don't proxy audio through the REST API. Speko is in the *control* path (mint token, dispatch worker). Audio flows through the session's media transport.
* They don't bill differently from one-shot calls. Usage is recorded per underlying STT/LLM/TTS call.
## Recording [#recording]
Every voice session is recorded by default. The agent worker captures both speakers — caller and agent — into a single mixed-mono Opus file, persisted to Google Cloud Storage at the end of the call. There is no separate "enable recording" call; producing a session produces a recording.
What gets captured:
* **Mixed mono Opus.** Both sides of the conversation in one file, \~24 kbps. Stereo / per-speaker tracks are not produced — assume one combined audio stream per session.
* **The full call.** Recording starts when the first participant joins the room and ends when the room empties or the session token expires.
* **Audio only.** Tool-call payloads and transcripts are not in the audio file; those live on the session row and the per-turn entries.
How to fetch one:
```bash
curl -L \
-H "Authorization: Bearer $SPEKO_API_KEY" \
https://api.speko.dev/v1/sessions/$SESSION_ID/recording \
--output session.opus
```
The endpoint 302-redirects to a short-lived (5 minute) signed GCS URL — pass `-L` so curl follows it. The signed URL is single-use within its TTL window; refetch the endpoint to get a fresh one rather than caching the URL itself.
### The status field [#the-status-field]
Each session entry carries a `recordingStatus` that walks through:
* `pending` — the call ended; the agent worker is assembling the file.
* `uploading` — the file is being pushed to GCS.
* `ready` — fully persisted and downloadable. `recordingObjectPath` and `recordingDurationMs` are populated.
* `failed` — the upload errored. The recording is unrecoverable; the session itself is unaffected.
* `suppressed` — the organization has `recordingEnabled` set to false, so no file was ever produced. This is the terminal state for opted-out orgs; it's not a transient one.
`null` on `recordingStatus` only appears on legacy session entries that predate this feature. Treat it as "unknown" and don't expect a download.
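A client that wants a recording can branch on `recordingStatus` before calling the download endpoint. The Python sketch below maps each documented status to a next step. The action labels and the `recording_action` name are hypothetical.

```python
def recording_action(status):
    """Decide what a client should do for a given recordingStatus."""
    if status == "ready":
        return "download"
    if status in ("pending", "uploading"):
        # Transient states: check the session entry again later.
        return "wait"
    # failed, suppressed, and legacy null entries never become downloadable.
    return "unavailable"

actions = {s: recording_action(s)
           for s in ("pending", "uploading", "ready", "failed", "suppressed", None)}
```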
### Retention [#retention]
Recordings are kept for **30 days** after the session ends, then deleted automatically by a GCS lifecycle policy. There is no in-product way to extend retention — if you need long-term storage, follow the redirect, download the file, and store it in your own bucket. If you need shorter retention, see the per-org opt-out below.
### Per-org opt-out [#per-org-opt-out]
Recording is governed at the organization level by `organization.recordingEnabled` (default `true`). Flipping it to `false`:
* Stops new sessions from producing files. Their `recordingStatus` becomes `suppressed` and `GET /v1/sessions/{id}/recording` returns 404.
* Does **not** retroactively delete prior recordings — those expire on the normal 30-day timer.
The flag is exposed in the dashboard under the **Record voice sessions** toggle on the Settings page. Per-call opt-out (skipping recording for a single session even when the org default is on) is not currently supported.
### HIPAA mode [#hipaa-mode]
Organizations on a HIPAA-mode plan are forced into `recordingEnabled: false` regardless of the dashboard toggle, and the toggle is locked. This is the current bridge until an end-to-end customer-managed-key path lands; see the compliance issue tracker for the full story.
### API endpoint [#api-endpoint]
`GET /v1/sessions/{id}/recording` is the only supported way to retrieve a recording. It authenticates with the same bearer key as every other endpoint and 302-redirects to a 5-minute signed URL on success, or 404 on any of: unknown session, recording not yet ready, recording failed, recording suppressed. Inspect the parent session entry's `recordingStatus` to disambiguate. Never construct GCS URLs from `recordingObjectPath` directly — those URLs are not publicly addressable, and the field exists only as an internal handle.
## Transcript [#transcript]
Every finalized STT and LLM turn during a cascade session is persisted server-side. The dashboard's session detail page surfaces them under a Transcript card; the API equivalent is `GET /v1/sessions/{id}/transcript`, which returns turns sorted by index.
The worker batches and debounces (\~200ms) finalized turns and POSTs them to `/v1/sessions/{id}/turns`. The ingest endpoint is idempotent on `(session_id, index)` so retries on transient errors are safe.
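Idempotency on `(session_id, index)` means a retried batch cannot duplicate turns. The in-memory Python sketch below illustrates the idea. It assumes the first write for a key wins, which is one possible reading of "idempotent", and the function names and turn fields are hypothetical.

```python
def ingest_turns(store, session_id, turns):
    """Insert turns keyed by (session_id, index); repeats are no-ops."""
    inserted = 0
    for turn in turns:
        key = (session_id, turn["index"])
        if key not in store:
            store[key] = turn
            inserted += 1
    return inserted

def transcript(store, session_id):
    """Return one session's turns sorted by index."""
    turns = [turn for (sid, _), turn in store.items() if sid == session_id]
    return sorted(turns, key=lambda turn: turn["index"])

store = {}
batch = [
    {"index": 1, "role": "agent", "text": "Hi, how can I help?"},
    {"index": 0, "role": "user", "text": "Hello"},
]
first = ingest_turns(store, "session-1", batch)
retry = ingest_turns(store, "session-1", batch)  # simulated retry after a timeout
ordered = transcript(store, "session-1")
```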
Interim STT partials are not persisted — only finalized turns make it into the transcript.
S2S sessions don't currently produce transcripts (the audio path bypasses our STT/LLM components). That's a follow-up.
# One-shot APIs (/guides/one-shot)
POST /v1/transcribe, /v1/synthesize, /v1/complete — single-turn calls without sessions.
For batch transcription, server-side TTS, and non-voice LLM completions you don't need a real-time session — just call the one-shot endpoints directly. Each is a single round-trip with built-in routing and failover.
## Auth [#auth]
Every one-shot call needs a bearer API key. Mint one at [API keys](https://platform.speko.dev/api-keys).
```http
Authorization: Bearer sk_live_...
```
## Transcribe [#transcribe]
```bash
curl -X POST https://api.speko.dev/v1/transcribe \
-H "Authorization: Bearer $SPEKO_API_KEY" \
-H "Content-Type: audio/wav" \
-H "x-speko-intent: {\"language\":\"en-US\"}" \
--data-binary @call.wav
```
Response (the `TranscribeResponse` carried by the stream's final `done` event):
```json
{
"text": "...",
"provider": "deepgram",
"model": "nova-2",
"confidence": 0.94,
"failoverCount": 0,
"scoresRunId": "..."
}
```
Notes:
* Audio body is binary. Send PCM/MP3/WAV/etc. as the raw request body, with no base64 encoding.
* Intent goes in the `x-speko-intent` header (JSON), not the body. Constraints in `x-speko-constraints`.
## Synthesize [#synthesize]
```bash
curl -X POST https://api.speko.dev/v1/synthesize \
-H "Authorization: Bearer $SPEKO_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"text": "Hello, world.",
"intent": { "language": "en-US" },
"voice": null,
"speed": 1
}' \
--output speech.bin
```
Response body is the audio. Content-Type indicates the format (e.g. `audio/pcm;rate=24000` for Cartesia, `audio/mpeg` for ElevenLabs). Routing headers (`X-Speko-Provider`, etc.) tell you which provider ran.
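Since the audio format varies by provider, callers that handle raw PCM need the sample rate from the `Content-Type` header. The Python sketch below parses values such as `audio/pcm;rate=24000`. The helper name is hypothetical, and returning `None` for non-PCM types is a design choice of this example.

```python
def pcm_sample_rate(content_type):
    """Return the sample rate for audio/pcm content types, else None."""
    media_type, *params = [part.strip() for part in content_type.split(";")]
    if media_type.lower() != "audio/pcm":
        return None
    for param in params:
        name, _, value = param.partition("=")
        if name.strip().lower() == "rate" and value.strip().isdigit():
            return int(value.strip())
    return None

rate = pcm_sample_rate("audio/pcm;rate=24000")
```

For compressed formats such as `audio/mpeg` the sample rate lives inside the file itself, so the helper deliberately returns `None` there.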
## Complete [#complete]
```bash
curl -X POST https://api.speko.dev/v1/complete \
-H "Authorization: Bearer $SPEKO_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "system", "content": "You are concise." },
{ "role": "user", "content": "Hi!" }
],
"intent": { "language": "en" }
}'
```
Response:
```json
{
"text": "Hello!",
"provider": "openai",
"model": "gpt-4o-mini",
"usage": { "promptTokens": 14, "completionTokens": 4 },
"failoverCount": 0,
"scoresRunId": "..."
}
```
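The routing fields in the response are handy for logging. As a sketch, the interface below copies the shape of the example response above (treat it as illustrative, not as the full schema), and the hypothetical `describeCompletion` helper turns one response into a single log line.

```ts
// Shape copied from the example response above; illustrative only.
interface CompleteResponse {
  text: string;
  provider: string;
  model: string;
  usage: { promptTokens: number; completionTokens: number };
  failoverCount: number;
  scoresRunId: string;
}

// Hypothetical helper: one log line per completion.
function describeCompletion(res: CompleteResponse): string {
  const totalTokens = res.usage.promptTokens + res.usage.completionTokens;
  // Only mention failover when the first-choice provider did not serve the call.
  const failover = res.failoverCount > 0 ? `, ${res.failoverCount} failover(s)` : '';
  return `${res.provider}/${res.model}: ${totalTokens} tokens${failover}`;
}

console.log(
  describeCompletion({
    text: 'Hello!',
    provider: 'openai',
    model: 'gpt-4o-mini',
    usage: { promptTokens: 14, completionTokens: 4 },
    failoverCount: 0,
    scoresRunId: 'run_example',
  }),
);
```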
## With an SDK [#with-an-sdk]
Both SDKs wrap all three endpoints with matching shapes. See [`@spekoai/sdk`](/sdk/overview) and [`spekoai` (Python)](/sdk-python/overview).
```ts TypeScript
import { readFile } from 'node:fs/promises';
import { Speko } from '@spekoai/sdk';

const speko = new Speko({ apiKey: process.env.SPEKO_API_KEY! });
const buf = await readFile('./call.wav'); // raw audio bytes

// One method per one-shot endpoint.
const t = await speko.transcribe(buf, { language: 'en-US' });
const a = await speko.synthesize('Hello', { language: 'en' });
const c = await speko.complete({
  messages: [{ role: 'user', content: 'Hi!' }],
  intent: { language: 'en' },
});
```
```python Python
import os
from pathlib import Path

from spekoai import Speko

speko = Speko(api_key=os.environ["SPEKO_API_KEY"])
buf = Path("call.wav").read_bytes()  # raw audio bytes

# One method per one-shot endpoint.
t = speko.transcribe(buf, language="en-US")
a = speko.synthesize("Hello", language="en")
c = speko.complete(
    messages=[{"role": "user", "content": "Hi!"}],
    intent={"language": "en"},
)
```
## When not to use one-shot [#when-not-to-use-one-shot]
If you need real-time voice with sub-second latency, barge-in, and partial transcripts, use [sessions](/concepts/sessions). One-shot endpoints are for batch and server-internal flows.
# Build a phone agent (/guides/phone-agents)
Inbound and outbound PSTN calls, phone-number routing, lifecycle webhooks, reports, and transfers.
Speko can run the same voice agent over browser sessions and PSTN calls. Phone calls use the platform-hosted LiveKit SIP path: Speko creates or receives the call leg, dispatches the agent worker, records the room, persists transcripts and events, and finalizes a call report after hangup.
Use this guide when you want a receptionist, appointment setter, callback workflow, or support line that can place calls, answer calls, and transfer callers to a human.
## Phone surfaces [#phone-surfaces]
| Need | API | SDK |
| ---------------------------- | --------------------------------------------------------- | ------------------------------------------------------------ |
| Place an outbound phone call | `POST /v1/sessions/phone` | `speko.voice.dial()` |
| Buy or import numbers | `/v1/phone-numbers/*` | `speko.phoneNumbers` |
| Route inbound calls | `agentId` or `dispatchMetadataTemplate` on a phone number | `speko.phoneNumbers.update()` |
| Inspect calls and reports | `/v1/calls/{id}` | `speko.calls.get()` |
| Read lifecycle events | `/v1/calls/{id}/events` | `speko.calls.events()` |
| Transfer live calls | `/v1/calls/{id}/transfers/*` | `speko.calls.blindTransfer()` / `speko.calls.warmTransfer()` |
## 1. Create an agent [#1-create-an-agent]
Phone calls work best with a persisted agent because the phone-number row can hydrate its prompt, routing intent, provider preferences, speech settings, tools, and lifecycle webhooks every time a call starts.
```bash
curl -X POST https://api.speko.dev/v1/agents \
-H "Authorization: Bearer $SPEKO_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
"name": "Front desk",
"systemPrompt": "You are the front-desk receptionist. Greet callers, collect their name and reason for calling, answer common questions, and transfer urgent calls.",
"intent": { "language": "en-US", "optimizeFor": "latency" },
"webhooks": {
"preCall": { "url": "https://example.com/speko/pre-call" },
"status": { "url": "https://example.com/speko/status" },
"postCall": {
"url": "https://example.com/speko/post-call",
"extractionFields": [
{ "name": "caller_intent", "type": "enum", "description": "Why the caller reached out", "options": ["booking", "support", "sales"] },
{ "name": "party_size", "type": "number", "description": "Number of people in the party" },
{ "name": "wants_callback", "type": "boolean", "description": "Did the caller ask to be called back?" }
]
}
}
}'
```
Lifecycle webhooks are Standard Webhooks-signed with the organization webhook secret. The legacy per-webhook `secret` field is still accepted for older clients, but new integrations should verify the org-level secret.
## 2. Add a phone number [#2-add-a-phone-number]
You can use a platform-managed number or register a customer-owned SIP trunk number.
Managed number search and purchase is currently available for US numbers only. It requires phone-number business verification and sufficient credits:
```ts
const options = await speko.phoneNumbers.searchAvailable({ areaCode: '415' });
const number = await speko.phoneNumbers.create({
e164: options[0]!.e164,
direction: 'both',
agentId: 'agent_123',
label: 'Main line',
});
```
For numbers you already own, import the SIP trunk instead:
```ts
await speko.phoneNumbers.importSipTrunk({
e164: '+442071234567',
sipConnectionInstallationId: '00000000-0000-4000-8000-000000000010',
direction: 'both',
agentId: 'agent_123',
label: 'London front desk',
});
```
`direction` controls whether the number can be used for inbound calls, outbound calls, or both. Inbound calls require either an `agentId` or a `dispatchMetadataTemplate`.
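The inbound rule can be checked before calling the API. This is a sketch of such a pre-flight check, not Speko's own validation: `validatePhoneNumberConfig` is a hypothetical helper, and `'inbound'` and `'outbound'` are assumed to be the accepted values alongside the `'both'` used in the examples above.

```ts
// 'both' appears in the examples above; 'inbound' and 'outbound' are assumed
// to be the other accepted values.
type Direction = 'inbound' | 'outbound' | 'both';

interface PhoneNumberConfig {
  e164: string;
  direction: Direction;
  agentId?: string;
  dispatchMetadataTemplate?: Record<string, string>;
}

// Hypothetical pre-flight check mirroring the rule stated above.
function validatePhoneNumberConfig(config: PhoneNumberConfig): string[] {
  const problems: string[] = [];
  // E.164: a plus sign, then up to 15 digits, the first of which is non-zero.
  if (!/^\+[1-9]\d{1,14}$/.test(config.e164)) {
    problems.push('e164 must look like +14155550123');
  }
  const acceptsInbound = config.direction === 'inbound' || config.direction === 'both';
  if (acceptsInbound && !config.agentId && !config.dispatchMetadataTemplate) {
    problems.push('inbound numbers need agentId or dispatchMetadataTemplate');
  }
  return problems;
}

console.log(validatePhoneNumberConfig({ e164: '+442071234567', direction: 'both' }));
```

An empty array means the config passes both checks; the server remains the source of truth.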
## 3. Place outbound calls [#3-place-outbound-calls]
Use `POST /v1/sessions/phone` or `speko.voice.dial()` for outbound PSTN calls. The response returns the Speko session id, room name, resolved caller ID, and SIP participant handle.
```ts
const call = await speko.voice.dial({
to: '+12015551234',
from: '+12015550199',
agentId: 'agent_123',
firstMessage: 'Hi, this is Ava from Acme. Is now still a good time?',
telephony: {
region: 'us-east',
amd: { mode: 'agent', timeoutSeconds: 8 },
},
metadata: {
campaignId: 'renewal-q2',
leadId: 'lead_456',
},
});
console.log(call.sessionId, call.status);
```
`agentId` is required unless you pass an ad hoc `intent`. Per-call fields such as `systemPrompt`, `firstMessage`, `voice`, `llm`, `ttsOptions`, `sttOptions`, `constraints`, and `metadata` override or extend the agent defaults for that call.
## 4. Receive inbound calls [#4-receive-inbound-calls]
When a call arrives on a registered number, Speko matches the dialed E.164 number, checks `direction`, creates a voice session, hydrates the linked agent, dispatches the worker, and bridges the caller into the LiveKit room.
For forwarded calls, Speko attempts to preserve the original forwarding source. It looks at Telnyx payload fields such as `forwarded_from`, `original_to`, and `redirecting_number`, plus SIP headers such as `Diversion` and `History-Info`. The normalized value is exposed as:
* `forwardedFromNumber` in session metadata.
* `forwarded_from_number` in pre-call webhook payloads.
* call events and reports through the session metadata.
Use `dispatchMetadataTemplate` when you need static metadata or a legacy template alongside the linked agent:
```ts
await speko.phoneNumbers.update('pn_123', {
agentId: 'agent_123',
dispatchMetadataTemplate: {
tenant: 'acme',
line: 'front_desk',
caller: '{{callerNumber}}',
dialedNumber: '{{dialedNumber}}',
forwardedFromNumber: '{{forwardedFromNumber}}',
},
});
```
The agent fields are canonical. If `agentId` and `dispatchMetadataTemplate` set the same pipeline key, the persisted agent wins and the template fills the gaps.
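To make the precedence rule concrete, here is a small model of placeholder substitution followed by the merge. It illustrates the documented behavior and is not Speko's implementation: `renderTemplate` and `mergeWithAgent` are hypothetical names, and rendering a missing or `null` variable as an empty string is an assumption.

```ts
type TemplateVars = Record<string, string | null | undefined>;

// Replaces {{name}} placeholders. Assumption: unknown or null variables
// render as an empty string.
function renderTemplate(
  template: Record<string, string>,
  vars: TemplateVars,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(template)) {
    out[key] = value.replace(/\{\{(\w+)\}\}/g, (_match, name: string) => vars[name] ?? '');
  }
  return out;
}

// Agent fields win on conflict; the template only fills keys the agent lacks.
function mergeWithAgent(
  agentFields: Record<string, string>,
  rendered: Record<string, string>,
): Record<string, string> {
  return { ...rendered, ...agentFields };
}

const rendered = renderTemplate(
  { tenant: 'acme', caller: '{{callerNumber}}', forwardedFromNumber: '{{forwardedFromNumber}}' },
  { callerNumber: '+12015551234', forwardedFromNumber: null },
);
console.log(mergeWithAgent({ systemPrompt: 'Agent prompt' }, { ...rendered, systemPrompt: 'Template prompt' }));
```

Spreading the agent fields last is what makes the persisted agent win when both sides set the same key.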
## 5. Customize calls before answer [#5-customize-calls-before-answer]
Configure an agent `preCall` webhook when your application needs to look up the caller, choose a greeting, inject account context, or override the call pipeline before the worker starts speaking.
Speko sends:
```json
{
"type": "call.pre_call",
"call_id": "session_123",
"session_id": "session_123",
"organization_id": "org_123",
"direction": "inbound",
"to": "+12015550199",
"from": "+12015551234",
"dialed_number": "+12015550199",
"forwarded_from_number": null,
"phone_number_id": "pn_123",
"call_control_id": "sip_participant_123",
"pipeline_config": {},
"metadata": {}
}
```
Return any supported pipeline overrides at the top level or under `pipelineConfig` / `pipeline_config`:
```json
{
"firstMessage": "Hi Maya, thanks for calling Acme.",
"systemPrompt": "The caller is Maya Chen, a priority customer. Keep responses brief.",
"metadata": {
"customerId": "cus_123",
"plan": "enterprise"
}
}
```
Supported pre-call override keys include `intent`, `constraints`, `voice`, `systemPrompt`, `firstMessage`, `llm`, `ttsOptions`, `sttOptions`, `backgroundAudio`, `tools`, pronunciation and text replacement rules, and idle reprompts.
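A pre-call handler is mostly a lookup from caller number to overrides. The sketch below keeps that logic in a pure function so it can sit behind any HTTP framework. The `CUSTOMERS` table and `buildPreCallOverrides` are hypothetical, the payload type lists only a subset of the fields shown above, and returning an empty object for unknown callers assumes that an empty response leaves the agent defaults in place.

```ts
// Subset of the call.pre_call payload shown above.
interface PreCallPayload {
  type: 'call.pre_call';
  direction: string;
  from: string;
  to: string;
  forwarded_from_number: string | null;
}

interface PreCallOverrides {
  firstMessage?: string;
  systemPrompt?: string;
  metadata?: Record<string, string>;
}

// Hypothetical CRM lookup table keyed by caller number.
const CUSTOMERS: Record<string, { id: string; name: string; plan: string }> = {
  '+12015551234': { id: 'cus_123', name: 'Maya', plan: 'enterprise' },
};

function buildPreCallOverrides(payload: PreCallPayload): PreCallOverrides {
  const customer = CUSTOMERS[payload.from];
  // Unknown caller: return no overrides (assumed to keep the agent defaults).
  if (!customer) return {};
  return {
    firstMessage: `Hi ${customer.name}, thanks for calling Acme.`,
    metadata: { customerId: customer.id, plan: customer.plan },
  };
}

console.log(
  buildPreCallOverrides({
    type: 'call.pre_call',
    direction: 'inbound',
    from: '+12015551234',
    to: '+12015550199',
    forwarded_from_number: null,
  }),
);
```

Verify the webhook signature before running this logic, then serialize the returned object as the JSON response body.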
## 6. Track lifecycle and reports [#6-track-lifecycle-and-reports]
Speko records durable lifecycle events for LiveKit, Telnyx, Speko status updates, SIP cause codes, and transfer attempts.
```ts
const call = await speko.calls.get('session_123');
const { events } = await speko.calls.events('session_123');
const report = await speko.calls.report('session_123');
```
Agent `status` webhooks receive call progress and failure information:
```json
{
"type": "call.status",
"call_id": "session_123",
"event_type": "call.hangup",
"provider": "telnyx",
"status": "ended",
"failure_cause": null,
"sip_status_code": 200,
"sip_status": "OK",
"metadata": {}
}
```
After hangup, Speko finalizes a report with transcript entries, summary, outcome, structured data, cost breakdown, artifacts, metadata, and any scheduled callback created by analysis. The agent `postCall` webhook receives the same report payload with `type: "call.report"`. Failed post-call deliveries are retried, and you can force an analysis rerun or a webhook retry with `speko.calls.finalizeReport(callId, { forceAnalysis: true, retryWebhook: true })`.
### Custom data extraction [#custom-data-extraction]
Add `extractionFields` to the `postCall` webhook to have Speko pull structured values out of every call. Each field has a `name`, a `type` (`string`, `number`, `boolean`, or `enum`), and a `description` that tells the analysis model what to extract — `enum` fields also take an `options` list. After the call, the model fills each field from the transcript, coerced to its declared type (or `null` when the call gives no basis for it), and the values arrive as a top-level `custom_data` object on the `call.report` payload, keyed by field name:
```json
{
"type": "call.report",
"summary": "Jane booked a table for four.",
"outcome": "scheduled",
"structured_data": { "...": "..." },
"custom_data": {
"caller_intent": "booking",
"party_size": 4,
"wants_callback": true
}
}
```
`custom_data` always carries one key per configured field, so the payload schema stays stable. You can also manage these fields from the agent's **Webhooks → After call ends** settings in the dashboard.
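On the receiving side it can be worth checking `custom_data` against the fields you configured before writing it to a database. The following is a defensive sketch under the rules described above (one key per field, values of the declared type or `null`). `findInvalidFields` is a hypothetical helper, not something Speko ships.

```ts
type FieldType = 'string' | 'number' | 'boolean' | 'enum';

interface ExtractionField {
  name: string;
  type: FieldType;
  options?: string[];
}

type CustomValue = string | number | boolean | null;

// Returns the names of fields whose value does not match the declared type.
function findInvalidFields(
  fields: ExtractionField[],
  customData: Record<string, CustomValue | undefined>,
): string[] {
  return fields
    .filter((field) => {
      const value = customData[field.name];
      if (value === null) return false; // null: the call gave no basis for a value
      if (value === undefined) return true; // every configured field should be present
      if (field.type === 'enum') {
        return typeof value !== 'string' || !(field.options ?? []).includes(value);
      }
      return typeof value !== field.type;
    })
    .map((field) => field.name);
}

const configuredFields: ExtractionField[] = [
  { name: 'caller_intent', type: 'enum', options: ['booking', 'support', 'sales'] },
  { name: 'party_size', type: 'number' },
  { name: 'wants_callback', type: 'boolean' },
];

console.log(
  findInvalidFields(configuredFields, { caller_intent: 'booking', party_size: 4, wants_callback: true }),
);
```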
## 7. Transfer callers [#7-transfer-callers]
Use the Calls API for operator-driven transfers, or add the built-in `transfer_call` tool to let the agent transfer based on conversation context.
```ts
await speko.calls.blindTransfer('session_123', {
to: '+12015554321',
ringingTimeout: 25,
metadata: { reason: 'billing escalation' },
});
const transfer = await speko.calls.warmTransfer('session_123', {
from: '+12015550199',
destinations: [
{ to: '+12015551234', label: 'Front desk' },
{ to: '+12015554321', label: 'Overflow' },
],
screeningPrompt: 'Confirm the recipient can help before bridging.',
fallback: { strategy: 'take_message' },
voicemailDetection: { mode: 'agent', timeoutSeconds: 10 },
});
```
Warm transfer starts a consultation leg first. Complete the transfer when the screened recipient accepts, or cancel with `tryNext: true` to continue through the destination list.
## Next [#next]
Provision, import, update, and release phone numbers.
Inspect reports, events, recordings, and live transfers.
Place outbound phone calls from TypeScript.
Full request and response schemas for phone calls.
# Real-time browser conversation (/guides/realtime-conversation)
Wire @spekoai/client into a web app — mint a session, join the transport, stream voice both ways.
`@spekoai/client` connects a browser tab to a Speko voice session over WebRTC. Your server mints a session (`POST /v1/sessions`) and the browser joins with the returned transport credentials. Keep `SPEKO_API_KEY` on your server and return only short-lived session credentials to the browser.
## Install [#install]
```bash
npm install @spekoai/client
```
## 1. Mint a session on your server [#1-mint-a-session-on-your-server]
```ts server.ts
app.post('/api/conversations', async (req, res) => {
const session = await fetch('https://api.speko.dev/v1/sessions', {
method: 'POST',
headers: {
Authorization: `Bearer ${process.env.SPEKO_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
mode: 'cascade',
intent: { language: 'en-US' },
systemPrompt: 'You are a helpful voice assistant.',
voice: undefined, // let routing pick
ttlSeconds: 900, // default
}),
}).then((r) => r.json());
res.json({
transportToken: session.transportToken,
transportUrl: session.transportUrl,
});
});
```
Never expose your `SPEKO_API_KEY` to the browser. The session token is short-lived and scoped to one room.
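The server route above forwards whatever Speko returned, including error bodies. One way to tighten it is to validate the response and return only the two transport fields. This sketch assumes nothing beyond the `transportToken` and `transportUrl` fields used above; `pickBrowserCredentials` is a hypothetical helper.

```ts
interface BrowserCredentials {
  transportToken: string;
  transportUrl: string;
}

// Hypothetical guard: accepts the parsed JSON from POST /v1/sessions and
// returns only the fields the browser needs, or throws if they are missing.
function pickBrowserCredentials(session: unknown): BrowserCredentials {
  if (typeof session !== 'object' || session === null) {
    throw new Error('Speko session response was not a JSON object');
  }
  const { transportToken, transportUrl } = session as Record<string, unknown>;
  if (typeof transportToken !== 'string' || typeof transportUrl !== 'string') {
    throw new Error('Speko session response is missing transport credentials');
  }
  // Any other field on the response is dropped here on purpose.
  return { transportToken, transportUrl };
}

console.log(
  pickBrowserCredentials({
    transportToken: 'token_example',
    transportUrl: 'wss://example.invalid',
    internalField: 'not forwarded',
  }),
);
```

In the route handler, call it on the parsed response and answer with a 502 if it throws.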
## 2. Join from the browser [#2-join-from-the-browser]
```tsx VoicePanel.tsx
import { useEffect, useRef, useState } from 'react';
import { VoiceConversation } from '@spekoai/client';
export function VoicePanel() {
const convRef = useRef<VoiceConversation | null>(null);
const [status, setStatus] = useState('idle');
const [transcript, setTranscript] = useState<string[]>([]);
async function start() {
const { transportToken, transportUrl } = await fetch('/api/conversations', {
method: 'POST',
}).then((r) => r.json());
const conv = await VoiceConversation.create({
transportToken,
transportUrl,
onConnect: () => setStatus('connected'),
onDisconnect: () => setStatus('idle'),
onMessage: ({ source, text, isFinal }) => {
if (isFinal) setTranscript((t) => [...t, `${source}: ${text}`]);
},
onStatusChange: (s) => setStatus(s),
onError: (err) => console.error(err),
});
convRef.current = conv;
}
async function stop() {
await convRef.current?.endSession();
convRef.current = null;
}
useEffect(() => () => { void convRef.current?.endSession(); }, []);
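return (
  <div>
    <button onClick={start}>Start</button>
    <button onClick={stop}>Stop</button>
    <p>Status: {status}</p>
    <ul>{transcript.map((t, i) => <li key={i}>{t}</li>)}</ul>
  </div>
);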
}
```
That's the whole loop: mint → connect → talk → end.
## What you can do mid-conversation [#what-you-can-do-mid-conversation]
```ts
await conv.setMicMuted(true);
conv.setVolume(0.8);
conv.sendUserMessage('hello'); // text input as if spoken
conv.sendContextualUpdate('user navigated to checkout');
```
`sendContextualUpdate` injects context the agent will see on its next turn without speaking it aloud — useful for app-state changes the agent should know about.
## Mic / device control [#mic--device-control]
`@spekoai/client` requests the mic with sensible defaults (echo cancellation, noise suppression, automatic gain control). Override them per session:
```ts
await VoiceConversation.create({
transportToken,
transportUrl,
audioConstraints: {
echoCancellation: false,
noiseSuppression: false,
autoGainControl: false,
},
});
```
## What the SDK does not hide [#what-the-sdk-does-not-hide]
* **Long-lived API keys.** Keep `SPEKO_API_KEY` on your server. Browser code should only receive short-lived session credentials.
* **Reconnect / retry.** A failed `connect()` throws `SpekoClientError`. Your UX decides whether to retry.
* **Tool calls / MCP / VAD streaming.** Deferred.
## Next [#next]
Full `@spekoai/client` reference.
Worker side of the same architecture.
# Tool calling (/guides/tool-calling)
Give a Speko voice agent the ability to invoke webhook tools mid-call. Register once in the dashboard, fire from any voice session.
Tool calling lets the LLM driving your voice session take action: query a database, schedule a visit, transfer to a human. The model decides when to invoke a tool from your prompt; Speko POSTs a [Standard Webhooks](https://www.standardwebhooks.com)-signed request to your endpoint, folds the JSON response back into the model's next turn, and the agent verbalizes the result.
This guide walks through registering a tool, hooking it into a Speko voice session, and confirming it fires.
## Architecture [#architecture]
```
Voice session Speko proxy Your endpoint
───────────── ─────────── ─────────────
LLM emits tool call ─→ /v1/complete loop ─→ POST /your/webhook
(signed body)
LLM verbalizes result ←─ response folded back ←─ 200 + JSON
```
Three pieces meet:
1. **Your endpoint** — a public HTTPS URL that receives the tool call and returns JSON.
2. **The Speko dashboard** — where you register the tool (name, description, JSON Schema parameters, your endpoint URL). Speko stores an HMAC signing secret you save once.
3. **A Speko voice session** — the worker fetches your registered tools at session start, exposes them to the LLM, and routes invocations through the executor.
## 1. Build your endpoint [#1-build-your-endpoint]
The executor POSTs the LLM-generated arguments as JSON. Whatever you return becomes the model's next observation, so keep responses small and specific.
```ts src/server.ts
import { Hono } from 'hono';
const PETS: Record<string, { name: string; species: string; age: number; status: string }> = {
luna: { name: 'Luna', species: 'corgi', age: 3, status: 'available' },
max: { name: 'Max', species: 'tabby cat', age: 5, status: 'available' },
};
const app = new Hono();
app.post('/lookup', async (c) => {
const { name } = (await c.req.json()) as { name?: string };
const pet =
PETS[
String(name ?? '')
.toLowerCase()
.trim()
];
if (!pet) return c.json({ error: 'Pet not found' }, 404);
return c.json(pet);
});
export default { port: Number(process.env.PORT ?? 8080), fetch: app.fetch };
```
Deploy this anywhere with a public HTTPS URL — Cloud Run, Fly.io, Render, Vercel functions.
### Verifying the signature [#verifying-the-signature]
Production endpoints MUST verify the [Standard Webhooks](https://www.standardwebhooks.com) signature on every request. Speko sends three headers:
| Header | Meaning |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `webhook-id` | Idempotency key for this delivery. Skip duplicates. |
| `webhook-timestamp` | Unix seconds when Speko signed the body. Reject anything older than \~5 minutes to prevent replay. |
| `webhook-signature` | `v1,<base64 signature>`. Multiple space-separated signatures may appear during rotation; accept if any one matches. |
Use the [`standardwebhooks`](https://www.npmjs.com/package/standardwebhooks) package — constant-time comparison and clock-skew tolerance are tricky to roll yourself.
```ts
import { Webhook } from 'standardwebhooks';
const wh = new Webhook(process.env.LOOKUP_PET_SIGNING_SECRET!);
app.post('/lookup', async (c) => {
const raw = await c.req.text();
try {
wh.verify(raw, Object.fromEntries(c.req.raw.headers));
} catch {
return c.text('signature mismatch', 401);
}
const { name } = JSON.parse(raw) as { name?: string };
// …
});
```
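Signature verification does not by itself skip duplicate deliveries, so deduplicating on `webhook-id` is left to the endpoint. A minimal in-memory sketch follows; the timestamp check is included only to make the replay window explicit. `isFreshTimestamp` and `DeliveryLog` are hypothetical names, and a multi-instance deployment would need a shared store with expiry instead of a local `Map`.

```ts
const TOLERANCE_SECONDS = 5 * 60;

// True when the webhook-timestamp header (Unix seconds) is within the window.
function isFreshTimestamp(
  header: string,
  nowMs: number,
  toleranceSeconds: number = TOLERANCE_SECONDS,
): boolean {
  const signedAt = Number(header);
  if (!Number.isInteger(signedAt)) return false;
  return Math.abs(nowMs / 1000 - signedAt) <= toleranceSeconds;
}

// In-memory dedupe keyed by webhook-id. Entries expire after ttlMs, which
// only needs to cover the window in which a replayed timestamp is accepted.
class DeliveryLog {
  private readonly seen = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = TOLERANCE_SECONDS * 1000) {
    this.ttlMs = ttlMs;
  }

  // Returns true the first time an id is seen, false for duplicates.
  markIfNew(id: string, nowMs: number): boolean {
    for (const [key, at] of this.seen) {
      if (nowMs - at > this.ttlMs) this.seen.delete(key);
    }
    if (this.seen.has(id)) return false;
    this.seen.set(id, nowMs);
    return true;
  }
}

const deliveries = new DeliveryLog();
console.log(deliveries.markIfNew('msg_2KQfP3QH8Gv7B', Date.now()));
console.log(deliveries.markIfNew('msg_2KQfP3QH8Gv7B', Date.now()));
```

For a duplicate, returning the same response as the first delivery without re-running side effects keeps retries harmless.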
## 2. Register the tool [#2-register-the-tool]
### Via the dashboard [#via-the-dashboard]
Open [Tools](https://platform.speko.dev/tools) in the dashboard, click **Add tool**, fill in:
* **Name** — `snake_case`, ≤ 64 chars (e.g. `lookup_pet`). The model sees this; pick something it'll match against the user's intent.
* **Description** — tell the model *when* to call this. Be explicit ("ALWAYS call this when the user asks about a specific pet by name").
* **Parameters** — a JSON Schema. Strict typing works; vague typing leads to the model passing garbage args.
* **Webhook URL** — your public HTTPS endpoint from step 1. Speko rejects HTTP, private/loopback hosts, and known cloud-metadata IPs at registration time.
On save, the dashboard shows the **signing secret once**. Copy it into your secrets manager — rotation requires creating a new tool (UUID-stable secret keys are coming).
### Via the API [#via-the-api]
```bash
curl -X POST "https://api.speko.dev/v1/agents/$SPEKO_AGENT_ID/tools" \
-H "Authorization: Bearer $SPEKO_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
"name": "lookup_pet",
"description": "Look up a pet by name. ALWAYS call this when the user asks about a specific pet.",
"parameters": {
"type": "object",
"required": ["name"],
"properties": {
"name": { "type": "string", "description": "First name of the pet." }
}
},
"source": {
"kind": "webhook",
"url": "https://your-endpoint.example.com/lookup",
"secret": "<32-char hex you supply — Speko stores it encrypted>"
}
}'
```
The `secret` you POST is what Speko uses to sign webhook deliveries. The server stores an encrypted copy and never echoes it back, so keep your local copy.
### Async webhook tools [#async-webhook-tools]
Webhook tools default to `responseMode: "sync"`: Speko waits for your endpoint response and feeds the JSON body into the next model turn. For work that should not block the conversation, set `responseMode: "async"` and provide an `asyncAck`:
```json
{
"source": {
"kind": "webhook",
"url": "https://your-endpoint.example.com/create-ticket",
"secret": "<32-char hex you supply>",
"responseMode": "async",
"asyncAck": "I started that request and will continue helping while it runs."
}
}
```
In async mode, Speko dispatches the signed webhook in the background and immediately returns the acknowledgement text to the model. Use this for ticket creation, CRM updates, notifications, and other side effects where the caller does not need the result before the next assistant turn.
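A quick client-side check can catch the two most common registration mistakes before the request is sent. This is a sketch built from the rules stated in this guide (HTTPS only, `asyncAck` required for async tools). `checkWebhookSource` is a hypothetical helper and does not replicate the server-side host checks.

```ts
interface WebhookSource {
  kind: 'webhook';
  url: string;
  secret: string;
  responseMode?: 'sync' | 'async';
  asyncAck?: string;
}

// Hypothetical client-side check; the server remains the source of truth.
function checkWebhookSource(source: WebhookSource): string[] {
  const problems: string[] = [];
  if (!source.url.startsWith('https://')) {
    problems.push('url must use https');
  }
  const mode = source.responseMode ?? 'sync'; // sync is the documented default
  if (mode === 'async' && !source.asyncAck?.trim()) {
    problems.push('async tools need a non-empty asyncAck');
  }
  return problems;
}

console.log(
  checkWebhookSource({
    kind: 'webhook',
    url: 'https://your-endpoint.example.com/create-ticket',
    secret: 'example-secret',
    responseMode: 'async',
  }),
);
```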
### Use the actual agent id [#use-the-actual-agent-id]
Tools are scoped to one persisted agent. Use the agent id returned by `POST /v1/agents` or shown on the dashboard agent page. The unique key is `(organization, agentId, toolName)`, so two agents can use the same tool name without sharing webhook config.
## 3. Wire the worker [#3-wire-the-worker]
If you run a LiveKit Agents worker, the adapter loads your registered tools at session start and merges them with anything the framework provides at runtime. Use `createSpekoComponents` with the registered-tools options:
```ts agent.ts
import { defineAgent, voice } from '@livekit/agents';
import * as silero from '@livekit/agents-plugin-silero';
import { Speko } from '@spekoai/sdk';
import { createSpekoComponents } from '@spekoai/adapter-livekit';
const speko = new Speko({
apiKey: process.env.SPEKO_API_KEY!,
baseUrl: process.env.SPEKO_BASE_URL,
});
export default defineAgent({
prewarm: async (proc) => {
proc.userData.vad = await silero.VAD.load();
},
entry: async (ctx) => {
const vad = ctx.proc.userData.vad as silero.VAD;
const { stt, llm, tts } = createSpekoComponents({
speko,
vad,
intent: { language: 'en-US', optimizeFor: 'latency' },
// Enable the registered-tools loader. The adapter calls
// speko.agents.tools.listChatTools(agentId) once per session — reusing
// the Speko client above for auth and base URL — and merges the result
// with whatever LiveKit's ToolContext provides. Registered tools win on
// name collision.
agentId: process.env.SPEKO_AGENT_ID!,
onRegisteredToolsError: (err) =>
console.error('SpekoWorker: tools fetch failed', err),
});
const session = new voice.AgentSession({ vad, stt, llm, tts });
await session.start({
agent: new voice.Agent({
instructions:
'You are a brief, friendly assistant. ' +
'When the user asks about a specific pet by name, ' +
'IMMEDIATELY call lookup_pet — never make up information.',
}),
room: ctx.room,
});
await ctx.connect();
},
});
```
Without `agentId`, the loader stays disabled and the agent only sees runtime tools — useful when you want to opt in selectively.
Outside a LiveKit worker, load the same tools yourself with `speko.agents.tools.listChatTools(agentId)` and pass them to `speko.complete({ tools })`. It returns every source kind (`inline`, `webhook`, `builtin`, `integration`) already in the `ChatTool[]` shape `/v1/complete` accepts.
## 4. Run a call [#4-run-a-call]
The simplest client is a browser using [`@spekoai/client`](/client/overview):
```ts
import { VoiceConversation } from '@spekoai/client';
const res = await fetch('/api/session', { method: 'POST' });
const { transportToken, transportUrl } = await res.json();
const conv = await VoiceConversation.create({
transportToken,
transportUrl,
onModeChange: (mode) => console.log(mode), // 'listening' | 'speaking'
});
```
Your `/api/session` server route mints browser-safe transport credentials via Speko:
```ts
const r = await fetch(process.env.SPEKO_BASE_URL + '/v1/sessions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
Authorization: 'Bearer ' + process.env.SPEKO_API_KEY,
},
body: JSON.stringify({
mode: 'cascade',
agentId: process.env.SPEKO_AGENT_ID!,
ttlSeconds: 900,
}),
});
const { transportToken, transportUrl } = await r.json();
```
## What gets sent over the wire [#what-gets-sent-over-the-wire]
When the model invokes a registered tool, Speko's executor signs the request with your secret and POSTs to your URL:
```http
POST https://your-endpoint.example.com/lookup
content-type: application/json
webhook-id: msg_2KQfP3QH8Gv7B
webhook-timestamp: 1735603214
webhook-signature: v1,F7ZxQk8j3p6m2N9...
{
"name": "Luna"
}
```
Your response body is what the model sees as the tool result. Errors propagate too — if your endpoint returns 4xx/5xx, the executor surfaces the error so the agent can apologize or retry instead of silently swallowing it.
## Debugging [#debugging]
Common failure modes:
* **Tool never invoked.** The model didn't decide to call it. Tighten the description (be explicit about *when* to call), or set `toolChoice: "required"` in your call options to force one.
* **Webhook never lands.** Check the worker logs for the executor span. Common causes: a 401 or 403 from your endpoint (signature mismatch), a 5xx (your code threw), or a timeout (your endpoint is too slow; budget under 4 seconds).
* **Agent says "couldn't find" instead of the real result.** Your endpoint returned 4xx. Either the query genuinely missed, or the model passed empty/wrong args. During development, have your endpoint echo back the body it received so you can spot the latter.
* **Two voices overlap in the room.** A second agent dispatched into the same room without ending the previous session. Always call `endSession()` on your `VoiceConversation` (or disconnect the participant) before opening a new conversation.
## Beyond webhooks [#beyond-webhooks]
Webhook tools are the most common, but a registered tool's `source` can also be:
* **`builtin`** — Speko-managed helpers you opt into without running your own endpoint. Current built-ins include `search_knowledge_base`, `transfer_call`, and `end_call`. `transfer_call` supports warm or blind transfers from the active phone session when configured with destinations.
* **`integration`** — an action from an org-installed Speko app (Google Calendar, Slack, …), resolved and executed server-side.
* **`inline`** — your own worker runs the tool; Speko just ships the schema to the model and returns the call to you.
All four kinds come back from `speko.agents.tools.listChatTools(agentId)` ready to hand to `speko.complete`.
## What's next [#whats-next]
* Streaming tool results for long-running queries.
Track progress on the [public roadmap](https://github.com/SpekoAI/platform).
# Build a voice agent (/guides/voice-agent)
Real-time STT → LLM → TTS pipeline using @spekoai/adapter-livekit on a LiveKit Agents worker.
This guide walks through standing up a LiveKit Agents worker that uses Speko for every modality. The worker registers with LiveKit Cloud, joins rooms on demand, and runs a streaming voice pipeline backed by Speko's routing.
If you only want browser-side conversation logic and don't run your own worker, see the [hosted session flow](/guides/realtime-conversation) instead.
## Architecture [#architecture]
```
Browser ⟷ LiveKit room ⟷ your agent worker
│
└─→ @spekoai/sdk → Speko gateway → providers
```
Three processes meet in a LiveKit room:
1. **Browser** uses [`@spekoai/client`](/client/overview) to join with a session token your server mints.
2. **Your API server** mints the token (`POST /v1/sessions` or your own `livekit-server-sdk` flow) and dispatches the agent worker.
3. **Your agent worker** (this guide) runs `@livekit/agents` with Speko-backed STT/LLM/TTS.
Audio flows browser ↔ LiveKit ↔ worker. Speko sits in the control path, not the audio path.
## Install [#install]
```bash
npm install @spekoai/sdk @spekoai/adapter-livekit \
@livekit/agents @livekit/agents-plugin-silero @livekit/rtc-node
```
`@livekit/agents` and `@livekit/rtc-node` are peer dependencies, so pin the versions you actually run.
## Worker entry [#worker-entry]
```ts agent.ts
import {
type JobContext,
type JobProcess,
ServerOptions,
cli,
defineAgent,
voice,
} from '@livekit/agents';
import * as silero from '@livekit/agents-plugin-silero';
import { Speko } from '@spekoai/sdk';
import { createSpekoComponents } from '@spekoai/adapter-livekit';
import { fileURLToPath } from 'node:url';
const speko = new Speko({ apiKey: process.env.SPEKO_API_KEY! });
export default defineAgent({
prewarm: async (proc: JobProcess) => {
proc.userData.vad = await silero.VAD.load();
},
entry: async (ctx: JobContext) => {
const vad = ctx.proc.userData.vad as silero.VAD;
const { stt, llm, tts } = createSpekoComponents({
speko,
vad,
intent: { language: 'en-US', optimizeFor: 'balanced' },
// optional: pin providers
// constraints: { allowedProviders: { tts: ['cartesia'] } },
});
const session = new voice.AgentSession({ vad, stt, llm, tts });
await session.start({
agent: new voice.Agent({
instructions: 'You are a helpful voice assistant. Be concise.',
}),
room: ctx.room,
});
await ctx.connect();
session.generateReply({ instructions: 'Greet the user and offer your assistance.' });
},
});
cli.runApp(
new ServerOptions({
agent: fileURLToPath(import.meta.url),
agentName: 'speko-demo',
}),
);
```
Run it with `node agent.js dev` after building (the `@livekit/agents` CLI expects a subcommand such as `dev` or `start`), or through your `tsx` setup of choice. The worker registers with LiveKit Cloud under `agentName` and waits for dispatches.
## Per-session config from dispatch metadata [#per-session-config-from-dispatch-metadata]
When your server creates a session, the dispatcher passes JSON metadata to the worker. Read it in `entry` to build a per-session pipeline:
```ts
import { z } from 'zod';
const dispatchSchema = z.object({
sessionId: z.string(),
intent: z.object({
language: z.string(),
optimizeFor: z.enum(['balanced', 'accuracy', 'latency', 'cost']).optional(),
}),
constraints: z.any().optional(),
voice: z.string().optional(),
systemPrompt: z.string().optional(),
});
// `||` also covers an empty-string metadata value, which JSON.parse would reject.
const meta = dispatchSchema.parse(JSON.parse(ctx.job.metadata || '{}'));
const { stt, llm, tts } = createSpekoComponents({
speko,
vad,
intent: meta.intent,
constraints: meta.constraints,
voice: meta.voice,
});
```
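Jobs dispatched without metadata (for example, during local testing) would make a strict schema throw. If a default pipeline is preferable in that case, a fallback can be sketched without a schema library as shown here. `intentFromMetadata` is a hypothetical helper, and falling back silently is a design choice that may not suit production, where a loud failure is often better.

```ts
const OPTIMIZE_MODES = ['balanced', 'accuracy', 'latency', 'cost'] as const;
type OptimizeFor = (typeof OPTIMIZE_MODES)[number];

interface Intent {
  language: string;
  optimizeFor?: OptimizeFor;
}

// Hypothetical helper: reads `intent` from dispatch metadata and returns the
// fallback when the metadata is empty, malformed, or lacks a language.
function intentFromMetadata(raw: string | undefined, fallback: Intent): Intent {
  if (!raw) return fallback; // missing or empty-string metadata
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return fallback;
  }
  const intent = (parsed as { intent?: unknown } | null)?.intent;
  if (typeof intent !== 'object' || intent === null) return fallback;
  const { language, optimizeFor } = intent as Record<string, unknown>;
  if (typeof language !== 'string' || language.length === 0) return fallback;
  // Unknown optimizeFor values are dropped rather than passed through.
  const mode = OPTIMIZE_MODES.find((m) => m === optimizeFor);
  return mode ? { language, optimizeFor: mode } : { language };
}

const defaultIntent: Intent = { language: 'en-US', optimizeFor: 'balanced' };
console.log(intentFromMetadata('{"intent":{"language":"de-DE","optimizeFor":"latency"}}', defaultIntent));
console.log(intentFromMetadata('', defaultIntent));
```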
## Limitations of v1 [#limitations-of-v1]
* **STT upload is utterance-bounded.** `/v1/transcribe` streams transcript events back, but the LiveKit adapter still uploads one VAD-segmented WAV per utterance.
* **TTS is sentence-bounded in LiveKit.** `/v1/synthesize` streams audio bytes while the adapter calls it once per tokenized sentence.
* **Tool calls are supported.** Inline tools return to the LiveKit runtime; registered webhook, builtin, and integration tools run server-side through `/v1/complete`.
* **TTS format constraints.** Cartesia (PCM) and WAV TTS work. ElevenLabs MP3 currently throws — pin a PCM-capable provider via `constraints.allowedProviders.tts` or rely on the router's score-driven default.
* **STT input.** Mono PCM16 frames; multi-channel throws.
See [`@spekoai/adapter-livekit` reference](/adapter-livekit/overview) for the full surface.
## Next [#next]
Wire `@spekoai/client` into your dashboard / web app.
Full adapter reference.
# Quickstart (/quickstart)
From sign-up to your first transcribe call in under five minutes.
Using Claude Code, Codex, OpenCode, Cursor, or another AI coding tool? Start by
connecting [Speko MCP](/quickstart/mcp).
## 1. Create an account and an API key [#1-create-an-account-and-an-api-key]
Sign up at [platform.speko.dev](https://platform.speko.dev), then open [API keys](https://platform.speko.dev/api-keys) and click **Create key**. Copy the raw value - it is shown once and starts with `sk_live_`.
```bash
export SPEKO_API_KEY=sk_live_xxx
```
## 2. Configure providers (optional) [#2-configure-providers-optional]
Speko uses platform-managed provider credentials by default. To bring your own keys (BYOK), open [Settings > Provider keys](https://platform.speko.dev/settings/provider-keys) and paste each provider's API key. Speko still picks the best provider per call; your keys are billed by the provider directly. See [BYOK](/concepts/byok).
## 3. Make your first call [#3-make-your-first-call]
```bash cURL
curl -X POST https://api.speko.dev/v1/transcribe \
  -H "Authorization: Bearer $SPEKO_API_KEY" \
  -H "Content-Type: audio/wav" \
  -H "x-speko-intent: {\"language\":\"en-US\"}" \
  --data-binary @call.wav
```
```ts TypeScript
import { Speko } from '@spekoai/sdk';
import { readFile } from 'node:fs/promises';

const speko = new Speko({ apiKey: process.env.SPEKO_API_KEY! });
const audio = await readFile('./call.wav');

const { text, provider, model, confidence } = await speko.transcribe(audio, {
  language: 'en-US',
});
console.log(text, 'from', provider, model, confidence);
```
```py Python
import os
from pathlib import Path
from spekoai import Speko
speko = Speko(api_key=os.environ["SPEKO_API_KEY"])
audio = Path("call.wav").read_bytes()
result = speko.transcribe(audio, language="en-US")
print(result.text, "from", result.provider, result.model, result.confidence)
```
The response includes `provider`, `model`, `confidence`, and `failoverCount` so you can see what actually ran. Routing headers are also returned: `X-Speko-Provider`, `X-Speko-Model`, `X-Speko-Failover-Count`, `X-Speko-Scores-Run-Id`.
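When calling the HTTP API directly, the routing headers listed above can be collected into one object for logging. The sketch below is illustrative only: `RoutingInfo` and `readRoutingHeaders` are hypothetical names, and it assumes `X-Speko-Failover-Count` carries a decimal integer.

```ts
// Hypothetical container for the routing headers named above.
interface RoutingInfo {
  provider: string | null;
  model: string | null;
  failoverCount: number | null;
  scoresRunId: string | null;
}

// Read the routing headers from a fetch Response's `headers`.
// Header lookup via Headers.get is case-insensitive.
function readRoutingHeaders(headers: Headers): RoutingInfo {
  const rawFailover = headers.get('X-Speko-Failover-Count');
  // Assumption: the value is a decimal integer; anything else becomes null.
  const parsed = rawFailover === null ? Number.NaN : Number.parseInt(rawFailover, 10);
  return {
    provider: headers.get('X-Speko-Provider'),
    model: headers.get('X-Speko-Model'),
    failoverCount: Number.isNaN(parsed) ? null : parsed,
    scoresRunId: headers.get('X-Speko-Scores-Run-Id'),
  };
}
```

After a `fetch` to `/v1/transcribe`, calling `readRoutingHeaders(response.headers)` would return the provider, model, failover count, and scores run id for that request.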
## 4. Pick your next path [#4-pick-your-next-path]
* Let your AI coding tool search docs, scaffold integrations, and use account tools.
* `@spekoai/client` over WebRTC.
* Connect Speko to a self-hosted LiveKit worker.
* `/v1/transcribe`, `/v1/synthesize`, `/v1/complete` for batch and server flows.
## 5. Or let your coding agent build it [#5-or-let-your-coding-agent-build-it]
The fastest path to a working voice agent is to hand the rest to your AI coding tool. Install [Speko MCP](/quickstart/mcp):
```bash
npx @spekoai/mcp@latest init
```
Then paste this prompt into your agent:
```text
Using the Speko MCP tools, build me a voice agent:
1. Search the Speko docs for current voice agent best practices.
2. Create an agent named "my-first-agent" with a friendly assistant
system prompt and the recommended STT, LLM, and TTS for my language.
3. Deploy it and start a test session so I can talk to it.
4. Scaffold a minimal web app that connects to it with @spekoai/client.
```
The agent uses Speko MCP to create, deploy, and test the voice agent against your account, then wires up a local app you can run immediately.
# MCP (/quickstart/mcp)
Connect Speko to Claude Code, Codex, OpenCode, Cursor, and other AI coding tools.
Speko MCP gives coding agents authenticated access to Speko operational tools for organizations, agents, sessions, calls, phone numbers, knowledge bases, evals, deployment, and migration helpers.
## Install in your tool [#install-in-your-tool]
For guided setup, run the Speko MCP wizard:
```bash
npx @spekoai/mcp@latest init
```
The hosted endpoint is:
```txt
https://mcp.speko.ai/mcp
```
Authenticate with OAuth when your MCP client supports it, or use a Speko API key from [API keys](https://platform.speko.dev/api-keys).
Manual setup is described below for Claude Code, Codex, OpenCode, Cursor, and other clients, in that order.
In Claude Code, add Speko MCP with OAuth:
```bash
claude mcp add --transport http speko https://mcp.speko.ai/mcp
```
Then run `/mcp` in Claude Code and complete the browser sign-in.
To use an API key instead:
```bash
claude mcp add --transport http speko https://mcp.speko.ai/mcp \
--header "Authorization: Bearer sk_live_xxx"
```
In Codex, add Speko MCP to `~/.codex/config.toml`:
```toml
[mcp_servers.speko]
url = "https://mcp.speko.ai/mcp"
```
Then authenticate:
```bash
codex mcp login speko
```
To use an API key instead:
```toml
[mcp_servers.speko]
url = "https://mcp.speko.ai/mcp"
bearer_token_env_var = "SPEKO_API_KEY"
```
In OpenCode, add Speko MCP to `opencode.json`:
```json
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "speko": {
      "type": "remote",
      "url": "https://mcp.speko.ai/mcp",
      "enabled": true
    }
  }
}
```
Then authenticate:
```bash
opencode mcp auth speko
```
To use an API key instead:
```json
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "speko": {
      "type": "remote",
      "url": "https://mcp.speko.ai/mcp",
      "oauth": false,
      "headers": {
        "Authorization": "Bearer {env:SPEKO_API_KEY}"
      },
      "enabled": true
    }
  }
}
```
In Cursor, open **Cursor Settings > MCP > Add new global MCP server**, then add:
```json
{
  "mcpServers": {
    "speko": {
      "url": "https://mcp.speko.ai/mcp"
    }
  }
}
```
To use an API key instead:
```json
{
  "mcpServers": {
    "speko": {
      "url": "https://mcp.speko.ai/mcp",
      "headers": {
        "Authorization": "Bearer ${env:SPEKO_API_KEY}"
      }
    }
  }
}
```
For any other MCP client, configure a remote MCP server with Streamable HTTP:
| Setting | Value |
| ------------ | ---------------------------------------- |
| Name | `speko` |
| URL | `https://mcp.speko.ai/mcp` |
| OAuth | Use the client's OAuth flow |
| API key auth | Send `Authorization: Bearer sk_live_xxx` |
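For a client that has no built-in MCP configuration, the settings in the table map onto a plain HTTP request. The sketch below builds the first `initialize` call of the MCP Streamable HTTP transport with API key auth. The URL and `Authorization` header come from the table. The helper name, the `clientInfo` values, and the `2025-03-26` protocol version are assumptions for illustration, and a real client would normally rely on an MCP SDK rather than hand-built requests.

```ts
const SPEKO_MCP_URL = 'https://mcp.speko.ai/mcp';

interface McpHttpRequest {
  url: string;
  init: {
    method: 'POST';
    headers: Record<string, string>;
    body: string;
  };
}

// Hypothetical helper: build the JSON-RPC `initialize` request for a
// Streamable HTTP MCP server, authenticated with a Speko API key.
function buildMcpInitializeRequest(apiKey: string): McpHttpRequest {
  return {
    url: SPEKO_MCP_URL,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
        // Streamable HTTP servers may answer with JSON or an SSE stream.
        Accept: 'application/json, text/event-stream',
      },
      body: JSON.stringify({
        jsonrpc: '2.0',
        id: 1,
        method: 'initialize',
        params: {
          // Assumed protocol revision; the server may negotiate another one.
          protocolVersion: '2025-03-26',
          capabilities: {},
          clientInfo: { name: 'example-client', version: '0.0.1' },
        },
      }),
    },
  };
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)`.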
## What to ask your agent [#what-to-ask-your-agent]
Once connected, ask your agent to:
* inspect organization usage and credit balance;
* create, update, deploy, or roll back agents;
* create sessions and inspect call transcripts or recordings;
* create phone numbers, knowledge bases, and evals;
* convert external voice-agent configs with `inspect_workspace`, `parse_external_config`, and `build_session_config`.