Chat Completions

POST /v1/chat/completions — the OpenAI-compatible chat endpoint of Clipia AI Gateway. Non-streaming and streaming responses, parameters, tool calling and structured outputs.

POST /v1/chat/completions is the core endpoint of Clipia AI Gateway, fully compatible with OpenAI Chat Completions. It takes a messages array and returns the assistant's response in non-streaming or streaming mode, with support for tool calling and structured output by JSON schema.

POST /v1/chat/completions

Non-streaming response

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["CLIPIA_API_KEY"],
    base_url="https://api.clipia.ai/v1",
)

resp = client.chat.completions.create(
    model="claude-opus-4-8",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Tell me three facts about Mars."},
    ],
    temperature=0.7,
    max_tokens=512,
)

print(resp.choices[0].message.content)
print(resp.usage)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.CLIPIA_API_KEY,
  baseURL: "https://api.clipia.ai/v1",
});

const resp = await client.chat.completions.create({
  model: "claude-opus-4-8",
  messages: [
    { role: "system", content: "You are a concise assistant." },
    { role: "user", content: "Tell me three facts about Mars." },
  ],
  temperature: 0.7,
  max_tokens: 512,
});

console.log(resp.choices[0].message.content);

curl https://api.clipia.ai/v1/chat/completions \
  -H "Authorization: Bearer $CLIPIA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-8",
    "messages": [
      { "role": "system", "content": "You are a concise assistant." },
      { "role": "user", "content": "Tell me three facts about Mars." }
    ],
    "temperature": 0.7,
    "max_tokens": 512
  }'

Response 200

{
  "id": "chatcmpl-3f9a1c7e2b41",
  "object": "chat.completion",
  "created": 1782300000,
  "model": "claude-opus-4-8",
  "provider": "Clipia",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "1. ...\n2. ...\n3. ..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 64,
    "total_tokens": 92,
    "cost": 0.046
  }
}

usage.cost is the cost of the request in credits. finish_reason is normalized to the OpenAI enum: stop, length, tool_calls, content_filter.
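
The fields described above can be read off a response with a few lines of code. The following is a minimal sketch that works on the response as a plain dict (for example, the decoded JSON body); `summarize` is a hypothetical helper name, not part of the gateway or the OpenAI SDK.

```python
def summarize(response: dict) -> dict:
    """Pull the reply text, finish flags, and credit cost out of a response."""
    choice = response["choices"][0]
    reason = choice["finish_reason"]
    return {
        "text": choice["message"].get("content") or "",
        # "length" means the token limit cut the reply short.
        "truncated": reason == "length",
        # "tool_calls" means the model is waiting for tool results.
        "wants_tools": reason == "tool_calls",
        "cost": response.get("usage", {}).get("cost"),
    }


# Trimmed copy of the sample response shown above.
sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "1. ...\n2. ...\n3. ..."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 28, "completion_tokens": 64, "total_tokens": 92, "cost": 0.046},
}
print(summarize(sample))
```

Checking `truncated` before using the text is a cheap way to notice replies that hit `max_tokens`.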

Streaming response

Pass stream: true — the response arrives incrementally as Server-Sent Events. Each event is a chat.completion.chunk with a delta; the stream ends with the literal data: [DONE].

stream = client.chat.completions.create(
    model="claude-opus-4-8",
    messages=[{"role": "user", "content": "Write a haiku about the sea."}],
    stream=True,
)

for chunk in stream:
    if not chunk.choices:  # e.g. the usage-only chunk sent when include_usage is set
        continue
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)

const stream = await client.chat.completions.create({
  model: "claude-opus-4-8",
  messages: [{ role: "user", content: "Write a haiku about the sea." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

curl -N https://api.clipia.ai/v1/chat/completions \
  -H "Authorization: Bearer $CLIPIA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-8",
    "messages": [{ "role": "user", "content": "Write a haiku about the sea." }],
    "stream": true
  }'

Event stream

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1782300000,"model":"claude-opus-4-8","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","model":"claude-opus-4-8","choices":[{"index":0,"delta":{"content":"Waves "},"finish_reason":null}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","model":"claude-opus-4-8","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
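
Clients that read the raw stream without an SDK can decode this wire format line by line. The following is a minimal sketch, assuming every event is a single `data:` line as in the sample above; `iter_sse_chunks` and `collect_text` are hypothetical helper names, and the sample lines are trimmed to the fields the code reads.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed chat.completion.chunk objects from raw SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # literal end-of-stream marker
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate every delta.content fragment in the stream."""
    parts = []
    for chunk in iter_sse_chunks(lines):
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Waves "},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
print(collect_text(sample))  # prints "Waves "
```

In a real client the `lines` argument would come from the HTTP response body, read line by line.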

Final usage in a stream

To receive usage while streaming, pass stream_options: { "include_usage": true }. An extra chunk with a populated usage and an empty choices: [] is then sent right before data: [DONE].
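
Because the usage chunk carries an empty `choices` list, a loop that indexes `choices[0]` on every chunk fails once `include_usage` is set. The following is a minimal sketch of a consumer that tolerates it, written against plain dicts as stand-ins for decoded chunks; `consume_stream` is a hypothetical helper name and the token counts in the sample are illustrative only.

```python
from typing import Iterable, Optional, Tuple


def consume_stream(chunks: Iterable[dict]) -> Tuple[str, Optional[dict]]:
    """Return (full_text, usage) from an iterable of chunk dicts."""
    text_parts = []
    usage = None
    for chunk in chunks:
        # The usage chunk has choices == [], so never index choices[0] blindly.
        if chunk.get("usage"):
            usage = chunk["usage"]
        for choice in chunk.get("choices") or []:
            content = (choice.get("delta") or {}).get("content")
            if content:
                text_parts.append(content)
    return "".join(text_parts), usage


# Hand-written sample chunks; the final one mimics the usage-only chunk.
chunks = [
    {"choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"content": "Waves "}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]},
    {"choices": [], "usage": {"prompt_tokens": 12, "completion_tokens": 3, "total_tokens": 15}},
]
text, usage = consume_stream(chunks)
print(text, usage["total_tokens"])
```

With the OpenAI Python SDK, the same option is passed as `stream_options={"include_usage": True}` alongside `stream=True`.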

Request parameters

Parameter compatibility

Both token-limit names are accepted — max_tokens and max_completion_tokens. Unknown top-level fields are ignored (they do not cause a 400), so code written for OpenAI ports over unchanged.

Tool / function calling

Full cycle: you describe tools in tools, the model returns tool_calls, you run the function on your side and send the result back as a message with role: "tool", after which the model produces the final answer.

import json

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Current weather in a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

messages = [{"role": "user", "content": "What's the weather in Moscow right now?"}]

# 1) The model decides to call a tool
resp = client.chat.completions.create(
    model="claude-opus-4-8", messages=messages, tools=tools,
)
msg = resp.choices[0].message
# resp.choices[0].finish_reason == "tool_calls"

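# 2) Run the function on your side
call = msg.tool_calls[0]  # assumes the model requested at least one tool call
args = json.loads(call.function.arguments)   # {"city": "Moscow"}
result = {"temp_c": 14, "condition": "cloudy"}  # stand-in for a real weather lookup using args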

# 3) Send the result back and get the final answer
messages.append(msg)  # assistant echo with tool_calls
messages.append({
    "role": "tool",
    "tool_call_id": call.id,
    "content": json.dumps(result),
})

final = client.chat.completions.create(
    model="claude-opus-4-8", messages=messages, tools=tools,
)
print(final.choices[0].message.content)

When streaming, tool_calls fragments arrive grouped by index: the function name and id come in the first delta for that index, then function.arguments fragments you concatenate into valid JSON. A completed tool call is marked with finish_reason: "tool_calls".

Structured outputs

To force a response that matches a JSON schema, pass response_format with type: "json_schema" and strict: true.

resp = client.chat.completions.create(
    model="gpt-5.5",
    messages=[{"role": "user", "content": "Extract name and age: Anna is 30."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                },
                "required": ["name", "age"],
                "additionalProperties": False,
            },
        },
    },
)

print(resp.choices[0].message.content)  # {"name": "Anna", "age": 30}

json_object vs json_schema

response_format: { "type": "json_object" } guarantees valid JSON but without a specific schema — and requires the word "json" to appear in messages. For strict structural conformance use json_schema with strict: true.
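
Since json_object mode gives no schema guarantee, the caller has to check the shape itself. The following is a minimal sketch of such a check for the `person` shape used above; `parse_person` is a hypothetical helper name, and a schema validation library could replace the hand-written checks.

```python
import json


def parse_person(content: str) -> dict:
    """Parse a model reply and check it against the 'person' shape by hand."""
    data = json.loads(content)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("name"), str):
        raise ValueError("'name' must be a string")
    age = data.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool):
        raise ValueError("'age' must be an integer")
    return data


print(parse_person('{"name": "Anna", "age": 30}'))
```

With json_schema and strict: true the same check becomes a safety net rather than a necessity.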

Errors

Errors arrive in the standard OpenAI envelope: { "error": { "message", "type", "param", "code" } }, always with Content-Type: application/json and the correct HTTP status.

Status	Code	When
`400`	`invalid_request_error`	Malformed request body or parameters.
`401`	`invalid_api_key`	Missing or invalid key.
`402`	`insufficient_credits`	Not enough credits on the balance.
`404`	`model_not_found`	Unknown `model`.
`429`	`rate_limit_exceeded`	Rate limit exceeded — see Account & limits.

{
  "error": {
    "message": "Not enough credits on the balance.",
    "type": "insufficient_credits",
    "param": null,
    "code": "insufficient_credits"
  }
}
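
On the client side, the envelope above can be flattened into one value that the calling code branches on. The following is a minimal sketch; `GatewayError` and `parse_error` are hypothetical names, and treating only 429 as retryable is an assumption of this example, not documented gateway behavior.

```python
from typing import NamedTuple, Optional


class GatewayError(NamedTuple):
    status: int
    code: Optional[str]
    message: str
    retryable: bool


def parse_error(status: int, body: dict) -> GatewayError:
    """Flatten the { "error": {...} } envelope into a GatewayError."""
    err = body.get("error") or {}
    return GatewayError(
        status=status,
        code=err.get("code") or err.get("type"),
        message=err.get("message", ""),
        # Assumption: only rate limiting is worth retrying automatically.
        retryable=status == 429,
    )


# The 402 sample body shown above.
body = {
    "error": {
        "message": "Not enough credits on the balance.",
        "type": "insufficient_credits",
        "param": None,
        "code": "insufficient_credits",
    }
}
print(parse_error(402, body))
```

A caller would typically back off and retry when `retryable` is true and surface `message` to the user otherwise.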
