Chat Completions
POST /v1/chat/completions — the OpenAI-compatible chat endpoint of Clipia AI Gateway. Non-streaming and streaming responses, parameters, tool calling and structured outputs.
POST /v1/chat/completions is the core endpoint of Clipia AI Gateway, fully compatible with OpenAI Chat Completions. It takes a messages array and returns the assistant's response in non-streaming or streaming mode, with support for tool calling and structured output by JSON schema.
POST /v1/chat/completions
Non-streaming response
Python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["CLIPIA_API_KEY"],
    base_url="https://api.clipia.ai/v1",
)

resp = client.chat.completions.create(
    model="claude-opus-4-8",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Tell me three facts about Mars."},
    ],
    temperature=0.7,
    max_tokens=512,
)
print(resp.choices[0].message.content)
print(resp.usage)

JavaScript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.CLIPIA_API_KEY,
  baseURL: "https://api.clipia.ai/v1",
});

const resp = await client.chat.completions.create({
  model: "claude-opus-4-8",
  messages: [
    { role: "system", content: "You are a concise assistant." },
    { role: "user", content: "Tell me three facts about Mars." },
  ],
  temperature: 0.7,
  max_tokens: 512,
});
console.log(resp.choices[0].message.content);

curl
curl https://api.clipia.ai/v1/chat/completions \
  -H "Authorization: Bearer $CLIPIA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-8",
    "messages": [
      { "role": "system", "content": "You are a concise assistant." },
      { "role": "user", "content": "Tell me three facts about Mars." }
    ],
    "temperature": 0.7,
    "max_tokens": 512
  }'

Response 200
{
  "id": "chatcmpl-3f9a1c7e2b41",
  "object": "chat.completion",
  "created": 1782300000,
  "model": "claude-opus-4-8",
  "provider": "Clipia",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "1. ...\n2. ...\n3. ..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 64,
    "total_tokens": 92,
    "cost": 0.046
  }
}

usage.cost is the cost of the request in credits. finish_reason is normalized to the OpenAI enum: stop, length, tool_calls, content_filter.
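As a minimal sketch of reading these fields, the helper below pulls the text, finish_reason and usage.cost out of a response body that has already been parsed into a dict. The name summarize_completion and the shape of its return value are illustrative choices, not part of the gateway or the OpenAI SDK. The sample body repeats the Response 200 example.

```python
def summarize_completion(resp: dict) -> dict:
    """Extract the fields a caller usually branches on from a response body."""
    choice = resp["choices"][0]
    reason = choice["finish_reason"]
    return {
        "text": choice["message"].get("content") or "",
        "finish_reason": reason,
        # "length" means the token limit cut the answer short.
        "truncated": reason == "length",
        # "tool_calls" means the model is waiting for tool results.
        "wants_tools": reason == "tool_calls",
        # Cost of the request in credits; 0.0 if the field is absent.
        "cost_credits": resp.get("usage", {}).get("cost", 0.0),
    }


sample = {
    "id": "chatcmpl-3f9a1c7e2b41",
    "object": "chat.completion",
    "model": "claude-opus-4-8",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "1. ...\n2. ...\n3. ..."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 28, "completion_tokens": 64, "total_tokens": 92, "cost": 0.046},
}

summary = summarize_completion(sample)
print(summary["finish_reason"], summary["cost_credits"])
```

Checking truncated before using the text is one way to notice that max_tokens was too low for the request.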
Streaming response
Pass stream: true — the response arrives incrementally as Server-Sent Events. Each event is a chat.completion.chunk with a delta; the stream ends with the literal data: [DONE].
Python
stream = client.chat.completions.create(
    model="claude-opus-4-8",
    messages=[{"role": "user", "content": "Write a haiku about the sea."}],
    stream=True,
)
for chunk in stream:
    # A chunk can carry an empty choices list (for example the usage-only chunk).
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)

JavaScript
const stream = await client.chat.completions.create({
  model: "claude-opus-4-8",
  messages: [{ role: "user", content: "Write a haiku about the sea." }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

curl
curl -N https://api.clipia.ai/v1/chat/completions \
  -H "Authorization: Bearer $CLIPIA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-8",
    "messages": [{ "role": "user", "content": "Write a haiku about the sea." }],
    "stream": true
  }'

Event stream
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1782300000,"model":"claude-opus-4-8","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","model":"claude-opus-4-8","choices":[{"index":0,"delta":{"content":"Waves "},"finish_reason":null}]}
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","model":"claude-opus-4-8","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]

Final usage in a stream
To receive usage while streaming, pass stream_options: { "include_usage": true }. An extra chunk with a populated usage and an empty choices: [] is then sent right before data: [DONE].
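For clients that read the event stream without an SDK, the following sketch shows one way to consume the raw lines: it concatenates delta.content, picks up the usage chunk with empty choices, and stops at data: [DONE]. The function name read_sse_stream is hypothetical, and the sample lines (including the token counts) are made up for illustration.

```python
import json
from typing import Iterable, Optional, Tuple


def read_sse_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Collect streamed text and the optional final usage from raw SSE lines."""
    parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators and non-data fields carry no payload
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # literal end-of-stream marker
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]  # sent only with include_usage
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts), usage


# Made-up sample stream in the shape shown under "Event stream".
sample_lines = [
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    "",
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Waves "},"finish_reason":null}]}',
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"fold"},"finish_reason":null}]}',
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    'data: {"object":"chat.completion.chunk","choices":[],"usage":{"prompt_tokens":12,"completion_tokens":2,"total_tokens":14}}',
    "data: [DONE]",
]

text, usage = read_sse_stream(sample_lines)
print(text)
print(usage)
```

Iterating over choices instead of indexing choices[0] is what keeps the usage-only chunk from raising an IndexError.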
Request parameters
Parameter compatibility
Both token-limit names are accepted — max_tokens and max_completion_tokens. Unknown top-level fields are ignored (they do not cause a 400), so code written for OpenAI ports over unchanged.
Tool / function calling
Full cycle: you describe tools in tools, the model returns tool_calls, you run the function on your side and send the result back as a message with role: "tool", after which the model produces the final answer.
import json

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Current weather in a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]
messages = [{"role": "user", "content": "What's the weather in Moscow right now?"}]

# 1) The model decides to call a tool
resp = client.chat.completions.create(
    model="claude-opus-4-8", messages=messages, tools=tools,
)
msg = resp.choices[0].message
# resp.choices[0].finish_reason == "tool_calls"

# 2) Run the function on your side
call = msg.tool_calls[0]
args = json.loads(call.function.arguments)  # {"city": "Moscow"}
result = {"temp_c": 14, "condition": "cloudy"}

# 3) Send the result back and get the final answer
messages.append(msg)  # assistant echo with tool_calls
messages.append({
    "role": "tool",
    "tool_call_id": call.id,
    "content": json.dumps(result),
})
final = client.chat.completions.create(
    model="claude-opus-4-8", messages=messages, tools=tools,
)
print(final.choices[0].message.content)

When streaming, tool_calls fragments arrive grouped by index: the function name and id come in the first delta for that index, then function.arguments fragments you concatenate into valid JSON. A completed tool call is marked with finish_reason: "tool_calls".
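The accumulation of streamed tool_calls fragments can be sketched as follows. The helper merge_tool_call_deltas is a hypothetical name, and it takes plain dicts shaped like the entries of delta.tool_calls; the sample fragments and the id "call_1" are invented for the example.

```python
import json


def merge_tool_call_deltas(deltas):
    """Merge streamed tool_calls fragments into complete calls, keyed by index."""
    calls = {}
    for fragment in deltas:
        slot = calls.setdefault(
            fragment["index"], {"id": None, "name": None, "arguments": ""}
        )
        if fragment.get("id"):
            slot["id"] = fragment["id"]  # present in the first delta for an index
        fn = fragment.get("function") or {}
        if fn.get("name"):
            slot["name"] = fn["name"]
        # Argument text arrives in pieces; only the full string is valid JSON.
        slot["arguments"] += fn.get("arguments") or ""
    return [
        {**call, "arguments": json.loads(call["arguments"] or "{}")}
        for _, call in sorted(calls.items())
    ]


# Invented fragments for a single call at index 0.
fragments = [
    {"index": 0, "id": "call_1", "type": "function",
     "function": {"name": "get_weather", "arguments": ""}},
    {"index": 0, "function": {"arguments": '{"ci'}},
    {"index": 0, "function": {"arguments": 'ty": "Moscow"}'}},
]

calls = merge_tool_call_deltas(fragments)
print(calls)
```

Parsing the arguments only after the stream reports finish_reason: "tool_calls" avoids calling json.loads on a partial string.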
Structured outputs
To force a response that matches a JSON schema, pass response_format with type: "json_schema" and strict: true.
resp = client.chat.completions.create(
    model="gpt-5.5",
    messages=[{"role": "user", "content": "Extract name and age: Anna is 30."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                },
                "required": ["name", "age"],
                "additionalProperties": False,
            },
        },
    },
)
print(resp.choices[0].message.content)  # {"name": "Anna", "age": 30}

json_object vs json_schema
response_format: { "type": "json_object" } guarantees valid JSON but without a specific schema — and requires the word "json" to appear in messages. For strict structural conformance use json_schema with strict: true.
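Even with a schema enforced on the server side, the content still arrives as a string, and a response cut off by the token limit may not be complete JSON. A minimal client-side sketch, assuming the person schema from the example above: the helper parse_person is hypothetical, and the rule of rejecting anything other than finish_reason "stop" is a design choice of this sketch rather than documented gateway behavior.

```python
import json


def parse_person(content: str, finish_reason: str) -> dict:
    """Parse the message content and check it against the 'person' shape."""
    if finish_reason != "stop":
        # For example "length": the JSON may have been cut off mid-object.
        raise ValueError(f"incomplete response: finish_reason={finish_reason}")
    data = json.loads(content)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("name"), str) or not isinstance(data.get("age"), int):
        raise ValueError("missing or mistyped 'name' / 'age'")
    return data


person = parse_person('{"name": "Anna", "age": 30}', "stop")
print(person["name"], person["age"])
```

With json_object mode the type checks matter more, since only the JSON syntax is guaranteed and not the presence of any particular field.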
Errors
Errors arrive in the standard OpenAI envelope: { "error": { "message", "type", "param", "code" } }, always with Content-Type: application/json and the correct HTTP status.
| Status | Code | When |
|---|---|---|
| 400 | invalid_request_error | Malformed request body or parameters. |
| 401 | invalid_api_key | Missing or invalid key. |
| 402 | insufficient_credits | Not enough credits on the balance. |
| 404 | model_not_found | Unknown model. |
| 429 | rate_limit_exceeded | Rate limit exceeded; see Account & limits. |
{
  "error": {
    "message": "Not enough credits on the balance.",
    "type": "insufficient_credits",
    "param": null,
    "code": "insufficient_credits"
  }
}
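For the error envelope described in the Errors section, a client typically needs only the status, the code and a decision about what to do next. The sketch below is one possible mapping. The function classify_error and the action labels are hypothetical, and treating 429 as the only status worth retrying is an assumption of this example, not a documented gateway policy.

```python
import json

# Assumed client-side policy for the statuses listed in the Errors table.
ACTIONS = {
    400: "fix_request",
    401: "fix_request",
    402: "top_up_credits",
    404: "fix_request",
    429: "retry_later",
}


def classify_error(status: int, body: str) -> dict:
    """Reduce an error response to the fields a caller usually branches on."""
    try:
        err = json.loads(body).get("error")
    except (json.JSONDecodeError, AttributeError):
        err = None  # body was not JSON, or not a JSON object
    if not isinstance(err, dict):
        err = {}
    return {
        "status": status,
        "code": err.get("code"),
        "message": err.get("message"),
        "action": ACTIONS.get(status, "unknown"),
    }


sample_body = (
    '{"error": {"message": "Not enough credits on the balance.", '
    '"type": "insufficient_credits", "param": null, "code": "insufficient_credits"}}'
)
decision = classify_error(402, sample_body)
print(decision["action"], decision["code"])
```

Keeping the mapping in a single table makes it easy to adjust once the actual limits and retry guidance for an account are known.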