Run a single-shot LLM completion. The router picks the best LLM provider for your intent and fails over automatically.
from spekoai import ChatMessage, RoutingIntent

# `speko` is an initialized Speko client (construction not shown on this page)
reply = speko.complete(
    messages=[ChatMessage(role="user", content="Hi!")],
    intent=RoutingIntent(language="en", vertical="general"),
)
print(reply.text, reply.provider, reply.usage.prompt_tokens)
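The automatic failover mentioned above can be pictured as a loop over ranked providers. This is an illustrative sketch of the idea, not the SDK's internals; the function and provider names are made up:

```python
# Illustrative failover loop: try each candidate provider in ranked
# order, counting failures until one succeeds.

def complete_with_failover(providers, request):
    """providers: list of (name, callable). Returns (text, provider, failover_count)."""
    failover_count = 0
    last_error = None
    for name, call in providers:
        try:
            return call(request), name, failover_count
        except Exception as exc:  # a real router would catch narrower errors
            last_error = exc
            failover_count += 1
    raise RuntimeError(f"all providers failed: {last_error}")


def flaky(request):
    raise TimeoutError("provider timed out")


def healthy(request):
    return "Hello!"


text, provider, failovers = complete_with_failover(
    [("primary", flaky), ("backup", healthy)], {"messages": []}
)
# text == "Hello!", provider == "backup", failovers == 1
```

A result's failover_count reflects how many such retries happened before the reply you received.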

Signature

Speko.complete(
    *,
    messages: list[ChatMessage | dict],
    intent: RoutingIntent | dict,
    system_prompt: str | None = None,
    temperature: float | None = None,
    max_tokens: int | None = None,
    constraints: PipelineConstraints | dict | None = None,
) -> CompleteResult

Parameters

messages
list[ChatMessage | dict]
required
Conversation history. Roles: system, user, assistant. Dicts are validated against ChatMessage on the way in.
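The dict-to-ChatMessage validation can be pictured with a minimal dataclass. This is a sketch of the behavior, not the SDK's actual model class:

```python
from dataclasses import dataclass

VALID_ROLES = {"system", "user", "assistant"}


@dataclass
class ChatMessageSketch:
    role: str
    content: str

    def __post_init__(self):
        # reject roles outside system / user / assistant
        if self.role not in VALID_ROLES:
            raise ValueError(f"invalid role: {self.role!r}")


def coerce(msg):
    # dicts are validated on the way in; message instances pass through
    return msg if isinstance(msg, ChatMessageSketch) else ChatMessageSketch(**msg)


msgs = [coerce({"role": "user", "content": "Hi!"})]
```

Passing a dict with a misspelled role or a missing content key fails at call time rather than at the provider.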
intent
RoutingIntent | dict
required
Routing intent — language, vertical, optional optimize_for.
system_prompt
string
Shortcut for a leading system message. Providers with a native system channel use it directly; others fold it into the message list.
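For providers without a native system channel, folding the prompt into the message list might look like this. The helper is illustrative, not SDK code:

```python
def fold_system_prompt(system_prompt, messages):
    """Prepend system_prompt as a leading system message; no-op if it is None."""
    if system_prompt is None:
        return list(messages)
    return [{"role": "system", "content": system_prompt}] + list(messages)


folded = fold_system_prompt(
    "You are concise.", [{"role": "user", "content": "Hi!"}]
)
# folded[0] == {"role": "system", "content": "You are concise."}
```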
temperature
float
Forwarded to the provider. Omit to use the provider’s default.
max_tokens
int
Max completion tokens. Omit to use the provider’s default.
constraints
PipelineConstraints | dict
Pipeline constraints, e.g. restricting which providers the router may select.

Returns — CompleteResult

text
string
Assistant reply.
provider
string
Provider that served the completion.
model
string
Model the provider used.
usage.prompt_tokens
int
usage.completion_tokens
int
failover_count
int
Number of provider failovers before the request succeeded.
scores_run_id
string | None
/v1/complete is buffered — each call returns one full completion. Streaming and tool/function calling are on the roadmap.

Example — multi-turn

messages = [
    {"role": "system", "content": "You are a concise voice assistant."},
    {"role": "user", "content": "Book me an appointment for Tuesday."},
]

first = speko.complete(
    messages=messages,
    intent={"language": "en", "vertical": "healthcare"},
    temperature=0.3,
    max_tokens=200,
)

messages.append({"role": "assistant", "content": first.text})
messages.append({"role": "user", "content": "3pm, with Dr. Chen."})

second = speko.complete(
    messages=messages,
    intent={"language": "en", "vertical": "healthcare"},
)

Example — pin a provider

from spekoai import AllowedProviders, PipelineConstraints

reply = speko.complete(
    messages=[{"role": "user", "content": "…"}],
    intent={"language": "en", "vertical": "finance"},
    constraints=PipelineConstraints(
        allowed_providers=AllowedProviders(llm=["anthropic"]),
    ),
)
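Per the signature above, constraints also accepts a plain dict. The key names below are an assumption that mirrors the PipelineConstraints and AllowedProviders fields shown in the previous example:

```python
# Hypothetical dict form of the provider pin above; key names assumed
# to mirror the PipelineConstraints / AllowedProviders dataclass fields.
constraints = {"allowed_providers": {"llm": ["anthropic"]}}
# Pass as `constraints=constraints` to speko.complete(...)
```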