One-shot APIs
POST /v1/transcribe, /v1/synthesize, /v1/complete — single-turn calls without sessions.
For batch transcription, server-side TTS, and non-voice LLM completions you don't need a real-time session — just call the one-shot endpoints directly. Each is a single round-trip with built-in routing and failover.
Auth
Every one-shot call needs a bearer API key. Mint one at API keys.
Authorization: Bearer sk_live_...
Transcribe
curl -X POST https://api.speko.dev/v1/transcribe \
  -H "Authorization: Bearer $SPEKO_API_KEY" \
  -H "Content-Type: audio/wav" \
  -H "x-speko-intent: {\"language\":\"en-US\"}" \
  --data-binary @call.wav
Response:
{
  "text": "...",
  "provider": "deepgram",
  "model": "nova-2",
  "confidence": 0.94,
  "failoverCount": 0,
  "scoresRunId": "..."
}
Notes:
- The audio body is binary. Send PCM/MP3/WAV/etc. directly as the request body, with no base64 encoding.
- Intent goes in the x-speko-intent header (JSON), not the body. Constraints go in x-speko-constraints.
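The notes above can be sketched in Python using only the standard library. This is a minimal illustration rather than an official client: the helper names `build_transcribe_request` and `send_json` are hypothetical, and it assumes that `x-speko-constraints` is JSON-encoded the same way as `x-speko-intent`. The URL and header names come from the curl example.

```python
import json
import urllib.request
from typing import Optional

# Base URL taken from the curl example above.
API_BASE = "https://api.speko.dev"


def build_transcribe_request(
    audio: bytes,
    api_key: str,
    intent: dict,
    constraints: Optional[dict] = None,
    content_type: str = "audio/wav",
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/transcribe request."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": content_type,
        # Intent travels as compact JSON in a header, not in the body.
        "x-speko-intent": json.dumps(intent, separators=(",", ":")),
    }
    if constraints is not None:
        # Assumption: constraints use the same JSON-in-header encoding.
        headers["x-speko-constraints"] = json.dumps(
            constraints, separators=(",", ":")
        )
    # The body is the raw audio bytes: no base64, no JSON wrapper.
    return urllib.request.Request(
        f"{API_BASE}/v1/transcribe", data=audio, headers=headers, method="POST"
    )


def send_json(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a real key and audio file, `send_json(build_transcribe_request(audio_bytes, api_key, {"language": "en-US"}))` would return a dictionary shaped like the response shown above. Keeping request construction separate from sending makes the header layout easy to inspect without touching the network.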
Synthesize
curl -X POST https://api.speko.dev/v1/synthesize \
  -H "Authorization: Bearer $SPEKO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, world.",
    "intent": { "language": "en-US" },
    "voice": null,
    "speed": 1
  }' \
  --output speech.bin
The response body is the raw audio. The Content-Type response header indicates the format (e.g. audio/pcm;rate=24000 for Cartesia, audio/mpeg for ElevenLabs). Routing headers (X-Speko-Provider, etc.) tell you which provider handled the request.
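Because the audio format varies by provider, a caller usually has to inspect Content-Type before saving or playing the result. The following is a rough sketch of that step, not part of any SDK: `parse_content_type`, `describe_audio`, and the `EXTENSIONS` mapping are hypothetical names, and the mapping covers only the two media types mentioned above.

```python
from typing import Dict, Optional, Tuple


def parse_content_type(value: str) -> Tuple[str, Dict[str, str]]:
    """Split 'audio/pcm;rate=24000' into a media type and its parameters."""
    media_type, *params = [part.strip() for part in value.split(";")]
    parsed: Dict[str, str] = {}
    for param in params:
        if "=" in param:
            key, _, val = param.partition("=")
            parsed[key.strip().lower()] = val.strip().strip('"')
    return media_type.lower(), parsed


# Hypothetical mapping; extend it for other formats as needed.
EXTENSIONS = {"audio/mpeg": ".mp3", "audio/pcm": ".pcm"}


def describe_audio(content_type: str) -> Tuple[str, Optional[int]]:
    """Return a file extension and, if present, the sample rate in Hz."""
    media_type, params = parse_content_type(content_type)
    extension = EXTENSIONS.get(media_type, ".bin")
    rate = params.get("rate", "")
    # Raw PCM has no container, so the rate parameter is the only hint.
    return extension, int(rate) if rate.isdigit() else None
```

Raw PCM carries no header of its own, so the `rate` parameter should be recorded alongside the file; without it the samples cannot be played back at the correct speed.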
Complete
curl -X POST https://api.speko.dev/v1/complete \
  -H "Authorization: Bearer $SPEKO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "system", "content": "You are concise." },
      { "role": "user", "content": "Hi!" }
    ],
    "intent": { "language": "en" }
  }'
Response:
{
  "text": "Hello!",
  "provider": "openai",
  "model": "gpt-4o-mini",
  "usage": { "promptTokens": 14, "completionTokens": 4 },
  "failoverCount": 0,
  "scoresRunId": "..."
}
With an SDK
Both SDKs wrap all three endpoints with matching shapes. See @spekoai/sdk and spekoai (Python).
import { Speko } from '@spekoai/sdk';
const speko = new Speko({ apiKey: process.env.SPEKO_API_KEY! });
// buf holds the raw audio bytes, loaded elsewhere (e.g. read from call.wav).
const t = await speko.transcribe(buf, { language: 'en-US' });
const a = await speko.synthesize('Hello', { language: 'en' });
const c = await speko.complete({
  messages: [{ role: 'user', content: 'Hi!' }],
  intent: { language: 'en' },
});
import os
from spekoai import Speko
speko = Speko(api_key=os.environ["SPEKO_API_KEY"])
# buf holds the raw audio bytes, loaded elsewhere (e.g. read from call.wav).
t = speko.transcribe(buf, language="en-US")
a = speko.synthesize("Hello", language="en")
c = speko.complete(
    messages=[{"role": "user", "content": "Hi!"}],
    intent={"language": "en"},
)
When not to use one-shot
If you need real-time voice with sub-second latency, barge-in, and partial transcripts, use sessions instead. One-shot endpoints are for batch and server-internal flows.
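For the batch case, one possible shape is to fan independent one-shot calls out over a small thread pool. This is a sketch under stated assumptions: `transcribe_batch` is a hypothetical helper, the `transcribe` argument stands in for whatever callable you use (an SDK method or a raw HTTP wrapper), and the default of four workers is an arbitrary choice, since this page documents no rate limits.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Tuple


def transcribe_batch(
    items: Iterable[Tuple[str, bytes]],
    transcribe: Callable[[bytes], dict],
    max_workers: int = 4,
) -> Dict[str, dict]:
    """Run one one-shot call per (name, audio) pair and collect outcomes."""
    results: Dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each call is an independent round-trip, so they can run in parallel.
        futures = {
            name: pool.submit(transcribe, audio) for name, audio in items
        }
        for name, future in futures.items():
            try:
                results[name] = {"ok": True, "result": future.result()}
            except Exception as exc:
                # Record the failure and keep going with the rest of the batch.
                results[name] = {"ok": False, "error": str(exc)}
    return results
```

Capturing errors per item means a single bad file does not abort the whole batch; failed entries can be retried or logged afterwards.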