Speko Docs

complete

POST /v1/complete — LLM completion with automatic provider routing.

Run a single-shot LLM completion. The router picks the best LLM provider for your intent and fails over automatically.

from spekoai import ChatMessage, RoutingIntent

# `speko` is an already-constructed Speko client.
reply = speko.complete(
    messages=[ChatMessage(role="user", content="Hi!")],
    intent=RoutingIntent(language="en"),
)
print(reply.text, reply.provider, reply.usage.prompt_tokens)

Signature

Speko.complete(
    *,
    messages: list[ChatMessage | dict],
    intent: RoutingIntent | dict,
    system_prompt: str | None = None,
    temperature: float | None = None,
    max_tokens: int | None = None,
    constraints: PipelineConstraints | dict | None = None,
) -> CompleteResult

await AsyncSpeko.complete(
    *,
    messages: list[ChatMessage | dict],
    intent: RoutingIntent | dict,
    system_prompt: str | None = None,
    temperature: float | None = None,
    max_tokens: int | None = None,
    constraints: PipelineConstraints | dict | None = None,
) -> CompleteResult

Parameters

messages: list[ChatMessage | dict] (required)

Conversation history. Roles: system, user, assistant. Dicts are validated against ChatMessage on the way in.

intent: RoutingIntent | dict (required)

Routing intent: language, plus an optional optimize_for.

system_prompt: string

Shortcut for a leading system message. Providers with a native system channel use it directly; others fold it into the message list.

temperature: float

Forwarded to the provider. Omit to use the provider's default.

max_tokens: int

Maximum number of completion tokens. Omit to use the provider's default.

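constraints: PipelineConstraints | dict

Optional routing constraints, such as allowed_providers to restrict which providers the router may choose.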
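
The system_prompt shortcut described above can be pictured as prepending a system message to the list. The sketch below is only an illustration of that idea using plain dicts; fold_system_prompt is a hypothetical helper, not part of the SDK, and the router's actual folding logic is not documented on this page.

```python
from typing import Optional


def fold_system_prompt(messages: list[dict], system_prompt: Optional[str]) -> list[dict]:
    """Return a new list with system_prompt as the leading system message.

    Hypothetical helper for illustration; it does not mutate the input list.
    """
    if not system_prompt:
        return list(messages)
    return [{"role": "system", "content": system_prompt}, *messages]


history = [{"role": "user", "content": "Hi!"}]
folded = fold_system_prompt(history, "You are a concise voice assistant.")
print([m["role"] for m in folded])
```

In practice, passing system_prompt to complete should make a helper like this unnecessary; it is shown only to make the "leading system message" behavior concrete.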

Returns — CompleteResult

text: string

Assistant reply.

provider: string
model: string
usage.prompt_tokens: int
usage.completion_tokens: int
failover_count: int
scores_run_id: string | None
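
As a rough sketch of how these fields might be used together, the helper below builds a one-line summary for logging. summarize is a hypothetical name, and the SimpleNamespace object is a stand-in with made-up values so the snippet runs without the SDK; a real CompleteResult is assumed to expose the same attribute names listed above.

```python
from types import SimpleNamespace


def summarize(result) -> str:
    """Format provider, model, total token count, and failover count."""
    total = result.usage.prompt_tokens + result.usage.completion_tokens
    return (
        f"{result.provider}/{result.model}: "
        f"{total} tokens, {result.failover_count} failover(s)"
    )


# Stand-in for a CompleteResult; every value here is invented for illustration.
fake = SimpleNamespace(
    text="Hello!",
    provider="example-provider",
    model="example-model",
    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=5),
    failover_count=0,
    scores_run_id=None,
)
print(summarize(fake))
```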

/v1/complete streams over the wire. The Python SDK consumes the stream and returns the final CompleteResult; explicit Python streaming helpers are not exposed yet.
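
To make "consumes the stream and returns the final result" concrete, here is a minimal sketch of accumulating streamed text fragments into one string. The wire format is not described on this page, so the chunk shape (a dict with a "delta" key) and the names consume_stream and fake_stream are assumptions made purely for illustration.

```python
from typing import Iterable


def consume_stream(chunks: Iterable[dict]) -> str:
    """Concatenate text fragments from a stream of chunks.

    Assumes each chunk is a dict with an optional "delta" string; this
    shape is hypothetical and not taken from the /v1/complete wire format.
    """
    parts = []
    for chunk in chunks:
        delta = chunk.get("delta")
        if delta:
            parts.append(delta)
    return "".join(parts)


def fake_stream():
    """Offline stand-in for a streamed response."""
    yield {"delta": "Hel"}
    yield {"delta": "lo"}
    yield {"delta": None}  # e.g. a final chunk carrying no text


print(consume_stream(fake_stream()))
```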

Example — multi-turn

messages = [
    {"role": "system", "content": "You are a concise voice assistant."},
    {"role": "user", "content": "Book me an appointment for Tuesday."},
]

first = speko.complete(
    messages=messages,
    intent={"language": "en"},
    temperature=0.3,
    max_tokens=200,
)

messages.append({"role": "assistant", "content": first.text})
messages.append({"role": "user", "content": "3pm, with Dr. Chen."})

second = speko.complete(
    messages=messages,
    intent={"language": "en"},
)
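
The append-then-call pattern above can be wrapped in a small helper. This is a sketch, not an SDK feature: ask is a hypothetical function, and EchoClient is a stand-in so the snippet runs offline. It assumes only what the example shows, namely that the client has a complete method taking messages and intent and that the result has a text attribute.

```python
from types import SimpleNamespace


def ask(client, messages: list[dict], user_text: str, **kwargs) -> str:
    """Append a user turn, call client.complete, and record the reply."""
    messages.append({"role": "user", "content": user_text})
    reply = client.complete(messages=messages, intent={"language": "en"}, **kwargs)
    messages.append({"role": "assistant", "content": reply.text})
    return reply.text


class EchoClient:
    """Offline stand-in for a Speko client; echoes the last message."""

    def complete(self, *, messages, intent, **kwargs):
        return SimpleNamespace(text=f"echo: {messages[-1]['content']}")


history = [{"role": "system", "content": "You are a concise voice assistant."}]
client = EchoClient()
ask(client, history, "Book me an appointment for Tuesday.")
ask(client, history, "3pm, with Dr. Chen.", temperature=0.3)
print([m["role"] for m in history])
```

With a real client, the same history list can be reused across turns so that each call sees the full conversation.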

Example — pin a provider

from spekoai import AllowedProviders, PipelineConstraints

reply = speko.complete(
    messages=[{"role": "user", "content": "…"}],
    intent={"language": "en"},
    constraints=PipelineConstraints(
        allowed_providers=AllowedProviders(llm=["anthropic"]),
    ),
)
