Chat Completions
Streaming and non-streaming chat inference across providers.
Create a chat completion
POST /v1/chat/completions
OpenAI-compatible chat inference. Send the standard request body; the gateway
maps it to the resolved provider and returns an OpenAI-shaped response with an
extra provider field indicating which upstream served the request.
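Stock OpenAI response types do not declare the provider field, so a typed client needs a small helper to read it. The following is a minimal sketch, assuming the field is a top-level string named provider on the response object; WithProvider and servedBy are illustrative names, not part of any SDK, and the sample object is hand-written rather than real gateway output.

```typescript
// Assumption: the gateway adds `provider` as a top-level string on the
// otherwise OpenAI-shaped response. `WithProvider` and `servedBy` are
// hypothetical names used only for this sketch.
type WithProvider<T> = T & { provider?: unknown };

function servedBy<T extends object>(completion: T): string | undefined {
  const provider = (completion as WithProvider<T>).provider;
  // Return undefined when the field is missing or not a string.
  return typeof provider === "string" ? provider : undefined;
}

// Hand-written response object for illustration only.
const sampleCompletion = {
  id: "chatcmpl-example",
  object: "chat.completion",
  choices: [
    {
      index: 0,
      message: { role: "assistant", content: "Hi." },
      finish_reason: "stop",
    },
  ],
  provider: "anthropic",
};

console.log(servedBy(sampleCompletion)); // prints "anthropic"
```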
// `client` is assumed to be an OpenAI SDK instance configured for the gateway (setup not shown).
const completion = await client.chat.completions.create({
  model: "anthropic/claude-haiku-4-5",
  messages: [
    { role: "system", content: "You are concise." },
    { role: "user", content: "Explain MoE routing in one sentence." },
  ],
  temperature: 0.7,
});
Streaming
With stream: true the response is a text/event-stream of
chat.completion.chunk frames, terminated by data: [DONE]. Set
stream_options: { include_usage: true } to receive a final usage-only chunk.
const stream = await client.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "Count to five." }],
  stream: true,
  stream_options: { include_usage: true },
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
If the upstream fails before any bytes are sent, the gateway returns a normal
JSON error with the right status. If it fails mid-stream, a terminal error
frame is emitted followed by [DONE], matching how OpenAI clients expect
mid-stream errors.
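For clients that read the event stream directly instead of through an SDK, the frame types described above can be told apart line by line. This is a rough sketch, not the gateway's own parser: it assumes an error frame is a JSON object with a top-level error key (the exact shape is not specified here), and classifyLine, Frame, and the sample frames are hypothetical.

```typescript
// Sketch of classifying raw SSE lines from a streamed chat completion.
// Assumption: an error frame carries a top-level `error` key.
type Frame =
  | { kind: "delta"; text: string }
  | { kind: "usage"; usage: unknown }
  | { kind: "error"; error: unknown }
  | { kind: "done" }
  | { kind: "ignored" };

function classifyLine(line: string): Frame {
  // Blank separators and SSE comment lines carry no payload.
  if (!line.startsWith("data:")) return { kind: "ignored" };
  const payload = line.slice("data:".length).trim();
  if (payload === "[DONE]") return { kind: "done" };

  const json = JSON.parse(payload) as {
    error?: unknown;
    usage?: unknown;
    choices?: Array<{ delta?: { content?: string | null } }>;
  };
  if (json.error !== undefined) return { kind: "error", error: json.error };

  const choices = json.choices ?? [];
  // With include_usage, the final chunk has no choices and a populated usage.
  if (choices.length === 0 && json.usage != null) {
    return { kind: "usage", usage: json.usage };
  }
  return { kind: "delta", text: choices[0]?.delta?.content ?? "" };
}

// Hand-written frames for illustration; the error body is made up.
const sampleLines = [
  'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"One"}}]}',
  'data: {"error":{"message":"upstream closed"}}',
  "data: [DONE]",
];
const sampleKinds = sampleLines.map((l) => classifyLine(l).kind);
console.log(sampleKinds); // prints [ 'delta', 'error', 'done' ]
```

Treating the error frame as its own case lets the caller surface the failure before the closing [DONE] arrives, rather than mistaking a truncated stream for a complete answer.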
Maniac extensions
| Field | Type | Purpose |
|---|---|---|
| tags | string[] | Group requests for analytics and filtering. |
| metadata | object | Arbitrary key/value metadata stored with the request. |
| trace | object | Distributed-trace context (id, span_id, parent_span_id, …) for telemetry stitching. |
These are accepted alongside the OpenAI fields and ignored by stock SDKs.
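As a sketch of how these fields might sit in a request body, assuming they are top-level keys next to model and messages: the ManiacExtensions and ChatRequest types below are derived from the table and are not published SDK types, and every value is a placeholder.

```typescript
// Assumption: extension fields are top-level keys of the request body.
// Only the trace keys named in the table are typed; others are left open.
interface ManiacExtensions {
  tags?: string[];
  metadata?: Record<string, unknown>;
  trace?: {
    id?: string;
    span_id?: string;
    parent_span_id?: string;
    [key: string]: unknown;
  };
}

interface ChatRequest extends ManiacExtensions {
  model: string;
  messages: Array<{ role: "system" | "user" | "assistant"; content: string }>;
}

// Placeholder values for illustration only.
const extendedBody: ChatRequest = {
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "Count to five." }],
  tags: ["docs-example"],
  metadata: { ticket: "hypothetical-123" },
  trace: { id: "trace-1", span_id: "span-1" },
};

// Serialized body, ready to POST to /v1/chat/completions with any HTTP client.
const extendedPayload = JSON.stringify(extendedBody);
console.log(extendedPayload);
```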
Map-or-reject policy
OpenAI parameters without an internal analogue are handled predictably:
- Rejected: n > 1 returns 400 invalid_request_error.
- Dropped (request still runs): logprobs, top_logprobs, input_audio content parts, and json_schema strictness.
- Passed through: unknown top-level params, so new OpenAI knobs keep working.
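A client can mirror this policy in a pre-flight check to warn callers before a request is sent. The sketch below is an assumption-laden illustration rather than the gateway's implementation: preflight and Verdict are hypothetical names, and "json_schema strictness" is interpreted here as the strict flag under response_format.json_schema.

```typescript
// Client-side sketch of the map-or-reject policy described above.
type Verdict = { rejected: string[]; dropped: string[] };

function preflight(body: Record<string, unknown>): Verdict {
  const rejected: string[] = [];
  const dropped: string[] = [];

  // Rejected: more than one completion per request.
  if (typeof body.n === "number" && body.n > 1) rejected.push("n");

  // Dropped: log-probability options.
  for (const key of ["logprobs", "top_logprobs"]) {
    if (body[key] !== undefined) dropped.push(key);
  }

  // Dropped: input_audio content parts inside any message.
  const messages = (Array.isArray(body.messages) ? body.messages : []) as Array<{
    content?: unknown;
  }>;
  const hasAudio = messages.some(
    (m) =>
      Array.isArray(m?.content) &&
      m.content.some((p: { type?: string }) => p?.type === "input_audio"),
  );
  if (hasAudio) dropped.push("input_audio");

  // Dropped: strictness of a json_schema response format (assumed to be `strict`).
  const format = body.response_format as
    | { json_schema?: { strict?: boolean } }
    | undefined;
  if (format?.json_schema?.strict !== undefined) dropped.push("json_schema.strict");

  // Anything else is passed through untouched, so it is not reported.
  return { rejected, dropped };
}

const verdict = preflight({
  model: "openai/gpt-4o-mini",
  n: 2,
  logprobs: true,
  messages: [{ role: "user", content: [{ type: "input_audio" }] }],
});
console.log(verdict);
```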
See the live schema and try requests in the API Reference.