complete
POST /v1/complete — LLM completion with automatic provider routing.
Run a single-shot LLM completion. The router picks the best LLM provider for your intent and fails over automatically.
from spekoai import ChatMessage, RoutingIntent

# `speko` is an already-constructed Speko client.
reply = speko.complete(
    messages=[ChatMessage(role="user", content="Hi!")],
    intent=RoutingIntent(language="en"),
)
print(reply.text, reply.provider, reply.usage.prompt_tokens)

Signature
Speko.complete(
    *,
    messages: list[ChatMessage | dict],
    intent: RoutingIntent | dict,
    system_prompt: str | None = None,
    temperature: float | None = None,
    max_tokens: int | None = None,
    constraints: PipelineConstraints | dict | None = None,
) -> CompleteResult

await AsyncSpeko.complete(
    *,
    messages: list[ChatMessage | dict],
    intent: RoutingIntent | dict,
    system_prompt: str | None = None,
    temperature: float | None = None,
    max_tokens: int | None = None,
    constraints: PipelineConstraints | dict | None = None,
) -> CompleteResult

Parameters
messages (list[ChatMessage | dict], required): Conversation history. Roles: system, user, assistant. Dicts are validated against ChatMessage on the way in.
intent (RoutingIntent | dict, required): Routing intent: language, plus an optional optimize_for.
system_prompt (string): Shortcut for a leading system message. Providers with a native system channel use it directly; others fold it into the message list.
temperature (float): Forwarded to the provider. Omit to use the provider's default.
max_tokens (int): Maximum number of completion tokens. Omit to use the provider's default.
constraints (PipelineConstraints | dict): Optional routing constraints, for example allowed_providers to restrict which providers may be used.

Returns: CompleteResult
text (string): Assistant reply.
provider (string)
model (string)
usage.prompt_tokens (int)
usage.completion_tokens (int)
failover_count (int)
scores_run_id (string | None)

/v1/complete streams over the wire. The Python SDK consumes the stream and returns the final CompleteResult; explicit Python streaming helpers are not exposed yet.
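
The system_prompt parameter is described above as a shortcut for a leading system message. The sketch below shows one way that folding could look for a provider without a native system channel. The name fold_system_prompt is hypothetical and not part of the spekoai SDK, and simply prepending the prompt (even when the list already contains a system message) is an assumption rather than documented behavior.

```python
from typing import Optional


def fold_system_prompt(messages: list[dict], system_prompt: Optional[str]) -> list[dict]:
    """Return a new message list with system_prompt as the leading system message.

    Hypothetical helper for illustration; not part of the spekoai SDK.
    """
    if system_prompt is None:
        # Nothing to fold: return a shallow copy so callers can mutate it safely.
        return list(messages)
    # Assumption: the prompt is prepended as-is, ahead of any existing messages.
    return [{"role": "system", "content": system_prompt}, *messages]


history = [{"role": "user", "content": "Book me an appointment for Tuesday."}]
folded = fold_system_prompt(history, "You are a concise voice assistant.")
print([m["role"] for m in folded])  # prints ['system', 'user']
```

The input list is left untouched, so the same history can be reused for a later call with a different prompt.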
Example: multi-turn
messages = [
    {"role": "system", "content": "You are a concise voice assistant."},
    {"role": "user", "content": "Book me an appointment for Tuesday."},
]
first = speko.complete(
    messages=messages,
    intent={"language": "en"},
    temperature=0.3,
    max_tokens=200,
)
messages.append({"role": "assistant", "content": first.text})
messages.append({"role": "user", "content": "3pm, with Dr. Chen."})
second = speko.complete(
    messages=messages,
    intent={"language": "en"},
)

Example: pin a provider
from spekoai import AllowedProviders, PipelineConstraints

reply = speko.complete(
    messages=[{"role": "user", "content": "…"}],
    intent={"language": "en"},
    constraints=PipelineConstraints(
        allowed_providers=AllowedProviders(llm=["anthropic"]),
    ),
)
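
The multi-turn example earlier on this page appends each reply to the message list by hand. A small wrapper can keep that bookkeeping in one place. This is only a minimal sketch: the Conversation class is hypothetical (not part of spekoai), and it assumes nothing beyond what this page documents, namely a client whose complete(messages=..., intent=...) returns an object with a text attribute.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Conversation:
    """Tracks message history across repeated complete() calls.

    Hypothetical helper for illustration; not part of the spekoai SDK.
    """

    client: Any  # assumed: exposes complete(messages=..., intent=..., **kwargs)
    intent: dict
    messages: list = field(default_factory=list)

    def say(self, content: str, **kwargs: Any) -> str:
        """Send one user turn and record the assistant reply."""
        user_turn = {"role": "user", "content": content}
        # Build the outgoing list first so a failed call leaves history unchanged.
        outgoing = [*self.messages, user_turn]
        reply = self.client.complete(messages=outgoing, intent=self.intent, **kwargs)
        self.messages = [*outgoing, {"role": "assistant", "content": reply.text}]
        return reply.text
```

With an existing client, usage might look like chat = Conversation(client=speko, intent={"language": "en"}) followed by chat.say("Book me an appointment for Tuesday."). Extra keyword arguments such as temperature or max_tokens are passed straight through to complete().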