POST /v1/chat/completions

OpenAI-Compatible Chat Completions API
curl --request POST \
  --url https://api.compilelabs.com/v1/chat/completions \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "messages": [
    {
      "content": "How can I help you today?",
      "role": "developer",
      "name": "jane_doe"
    }
  ],
  "stream": false,
  "tool_choice": "auto",
  "tools": [
    {
      "function": {
        "name": "<string>",
        "parameters": {
          "type": "object",
          "required": "location",
          "properties": {
            "location": {
              "description": "The city and state, e.g. San Francisco, CA",
              "type": "string"
            },
            "unit": {
              "description": "Unit for the output - one of (celsius, fahrenheit)",
              "type": "string"
            }
          }
        },
        "description": "Get the current weather in a given location"
      },
      "type": "function"
    }
  ],
  "parallel_tool_calls": true,
  "max_completion_tokens": 1024,
  "temperature": 0.7,
  "top_p": 0.9,
  "response_format": {
    "json_schema": {
      "properties": {
        "location": {
          "description": "The city and state, e.g. San Francisco, CA",
          "type": "string"
        },
        "unit": {
          "description": "Unit for the output - one of (celsius, fahrenheit)",
          "type": "string"
        }
      },
      "required": [
        "location",
        "unit"
      ],
      "type": "object"
    },
    "type": "json_schema"
  }
}
'
{
  "id": "<string>",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "system",
        "content": "How can I help you today?",
        "tool_calls": {
          "function": {
            "arguments": "{'location': 'San Francisco'}",
            "name": "get_weather"
          },
          "id": "tool_call_1",
          "type": "function"
        }
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123
  },
  "object": "chat.completion"
}
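The curl request above can also be assembled in Python. This is a minimal sketch, not an official SDK: the model name, message, and commented-out `requests` call are illustrative, and you would substitute your own API key.

```python
import json

def build_chat_request(model, messages, **options):
    """Assemble a chat/completions request body as a dict.

    Any extra keyword arguments (temperature, tools, stream, ...)
    are passed through into the body unchanged.
    """
    body = {"model": model, "messages": messages}
    body.update(options)
    return body

payload = build_chat_request(
    "meta/llama-4-scout",
    [{"role": "user", "content": "What is the weather in San Francisco?"}],
    temperature=0.7,
    max_completion_tokens=1024,
)

# To actually send it (requires the `requests` package and a real API key):
# import requests
# resp = requests.post(
#     "https://api.compilelabs.com/v1/chat/completions",
#     headers={"Content-Type": "application/json", "x-api-key": "YOUR_KEY"},
#     data=json.dumps(payload),
# )

print(json.dumps(payload, indent=2))
```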

Headers

x-api-key
string | null

Your API key.

Body

application/json
model
string
required

Model ID used to generate the response, like meta/llama-4-scout or moonshotai/kimi-k2-0905. CompileLabs offers a wide range of models with different capabilities, performance characteristics, and price points. Refer to the model guide to browse and compare available models.

Required string length: 3 - 255
messages
(ChatCompletionDeveloperMessageParam · object | ChatCompletionSystemMessageParam · object | ChatCompletionUserMessageParam · object | ChatCompletionToolMessageParam · object | ChatCompletionAssistantMessageParam · object)[]
required

A list of messages comprising the conversation so far.

Minimum array length: 1
  • ChatCompletionDeveloperMessageParam
  • ChatCompletionSystemMessageParam
  • ChatCompletionUserMessageParam
  • ChatCompletionToolMessageParam
  • ChatCompletionAssistantMessageParam
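A conversation is an ordered list of message dicts whose `role` values come from the union above. The sketch below is illustrative (the content strings and `name` field are made up); note that the full list is resent on every call, since the API is stateless.

```python
# A minimal multi-turn conversation covering common message roles.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What's the capital of France?", "name": "jane_doe"},
    {"role": "assistant", "content": "Paris."},
    {"role": "user", "content": "And its population?"},
]

# Every role must be one of the variants listed above.
valid_roles = {"developer", "system", "user", "tool", "assistant"}
assert all(m["role"] in valid_roles for m in messages)
```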
stream
boolean | null
default:false

If set to true, the model response is streamed to the client as it is generated, using server-sent events. See the Streaming section below and the streaming responses guide for details on handling streaming events.
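Streamed responses arrive as server-sent events. Assuming the usual OpenAI-compatible framing (each event is a `data: <json>` line, and the stream ends with `data: [DONE]`), a chunk parser can be sketched as follows; the chunk shapes in the example are fabricated for illustration.

```python
import json

def parse_sse_lines(lines):
    """Yield decoded JSON chunks from server-sent-event lines.

    Assumes OpenAI-compatible framing: each event is a 'data: <json>'
    line, and the stream is terminated by 'data: [DONE]'.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        body = line[len("data:"):].strip()
        if body == "[DONE]":
            return
        yield json.loads(body)

# Two fabricated delta chunks followed by the terminator:
raw = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(c["choices"][0]["delta"]["content"] for c in parse_sse_lines(raw))
```

In a real client the `lines` iterable would come from the HTTP response body (e.g. `resp.iter_lines()` with `requests`).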

tool_choice

Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present; auto is the default if tools are present.

Available options:
auto,
none,
required
tools
FunctionTool · object[] | null

A list of tools the model may call. Currently, only functions are supported as a tool.
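A function tool pairs a name and description with a JSON-Schema `parameters` block describing its arguments. The sketch below mirrors the `get_weather` tool from the request example above, and shows the object form of `tool_choice` used to force that specific tool.

```python
# Function tool definition matching the request example above.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "unit": {
                    "type": "string",
                    "description": "Unit for the output - one of (celsius, fahrenheit)",
                },
            },
            "required": ["location"],
        },
    },
}

# Forcing this specific tool rather than letting the model choose:
forced_choice = {"type": "function", "function": {"name": "get_weather"}}
```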

parallel_tool_calls
boolean
default:true

Whether to enable parallel function calling.

max_completion_tokens
integer | null

An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

Required range: x > 1
Example:

1024

temperature
number | null
default:1

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

Required range: 0 <= x <= 2
Example:

0.7

top_p
number | null
default:1

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

Required range: 0 <= x <= 1
Example:

0.9

response_format
Text · object

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model's output matches your supplied JSON schema. Learn more in the Structured Outputs guide.

  • Text
  • JSONSchema
Example:
{
  "json_schema": {
    "properties": {
      "location": {
        "description": "The city and state, e.g. San Francisco, CA",
        "type": "string"
      },
      "unit": {
        "description": "Unit for the output - one of (celsius, fahrenheit)",
        "type": "string"
      }
    },
    "required": ["location", "unit"],
    "type": "object"
  },
  "type": "json_schema"
}
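With Structured Outputs the model's message content is a JSON string conforming to your schema, so it can be parsed directly with `json.loads`. The checker below is a deliberately minimal sketch, not a full JSON-Schema validator (a library such as `jsonschema` would do that); it only illustrates the contract the `required` list enforces.

```python
import json

def missing_required(schema, payload):
    """Return the schema's 'required' keys that are absent from payload."""
    return [k for k in schema.get("required", []) if k not in payload]

schema = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "unit": {"type": "string"},
    },
    "required": ["location", "unit"],
}

# A model response conforming to the schema above:
out = json.loads('{"location": "San Francisco, CA", "unit": "celsius"}')
missing = missing_required(schema, out)
```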

Response

Successful Response

id
string
required

A unique identifier for the chat completion.

created
integer
required

The Unix time in seconds when the response was generated.

model
string
required

The model used for the chat completion.

choices
ChatCompletionChoice · object[]
required

The list of chat completion choices.

usage
UsageInfo · object
required

Usage statistics for the completion request.
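By the usual OpenAI-compatible convention, `total_tokens` is the sum of the prompt and completion counts; a client-side sanity check (with fabricated numbers) looks like:

```python
# Usage fields from a fabricated response.
usage = {"prompt_tokens": 42, "completion_tokens": 100, "total_tokens": 142}

# total_tokens should equal prompt_tokens + completion_tokens.
billed = usage["prompt_tokens"] + usage["completion_tokens"]
```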

object
string
default:chat.completion

The object type, which is always 'chat.completion'.

Allowed value: "chat.completion"