Create a chat completion using OpenAI-compatible SDKs or tools.
Your API key.
Model ID used to generate the response, like meta/llama-4-scout or moonshotai/kimi-k2-0905. CompileLabs offers a wide range of models with different capabilities, performance characteristics, and price points. Refer to the model guide to browse and compare available models.
Required string length: 3 - 255

A list of messages comprising the conversation so far.
Minimum length: 1

If set to true, the model response data will be streamed to the client as it is generated using server-sent events. See the Streaming section below, along with the streaming responses guide, for more information on how to handle the streaming events.
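As a sketch of the streaming flow (the request shape follows the schema above; the model name, prompt, and chunk contents are illustrative, and the chunk layout assumes the usual OpenAI-compatible choices/delta structure), a client sets stream to true and decodes each server-sent event line as a JSON chunk:

```python
import json

# Hypothetical request body for a streaming chat completion; field names
# follow the OpenAI-compatible schema described above.
request_body = {
    "model": "meta/llama-4-scout",
    "messages": [{"role": "user", "content": "Write a haiku about the sea."}],
    "stream": True,
}

# Each server-sent event carries one JSON chunk. A simulated event line:
sse_line = 'data: {"choices": [{"delta": {"content": "Waves "}}]}'

def delta_from_sse(line: str) -> str:
    """Extract the incremental text from a single SSE data line."""
    chunk = json.loads(line.removeprefix("data: "))
    return chunk["choices"][0]["delta"].get("content", "")

print(delta_from_sse(sse_line))  # -> Waves
```

In a real client the SDK handles the SSE framing; the point is that each chunk carries only the newly generated text in a delta, which the client appends as it arrives.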
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {'type': 'function', 'function': {'name': 'my_function'}} forces the model to call that tool. none is the default when no tools are present; auto is the default if tools are present.
Available options: auto, none, required

A list of tools the model may call. Currently, only functions are supported as a tool.
Whether to enable parallel function calling.
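A minimal sketch of the tool parameters above, assuming the standard OpenAI-compatible request shape. The get_weather function is a made-up example, not part of the API; the tool_choice value forces the model to call it rather than reply in prose:

```python
import json

# Hypothetical request body demonstrating tools, tool_choice, and
# parallel_tool_calls. The get_weather tool is illustrative only.
request_body = {
    "model": "meta/llama-4-scout",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # Force a call to get_weather instead of a plain text reply:
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    # Disable parallel function calling for this request:
    "parallel_tool_calls": False,
}

print(json.dumps(request_body, indent=2))
```

With tool_choice set to "auto" instead, the model would be free to answer directly or call any of the listed tools.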
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
Required range: x > 1. Default: 1024
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
Required range: 0 <= x <= 2. Default: 0.7
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Required range: 0 <= x <= 1. Default: 0.9
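The advice above to tune one of temperature or top_p, but not both, might look like this in practice (model name, prompt, and values are illustrative):

```python
# Two alternative request bodies, each tuning exactly one sampling knob.
base = {
    "model": "meta/llama-4-scout",
    "messages": [{"role": "user", "content": "Name three colors."}],
}

# More focused and deterministic output: lower the temperature,
# leave top_p at its default.
focused = {**base, "temperature": 0.2}

# Nucleus sampling instead: consider only the top 90% probability mass,
# leave temperature at its default.
nucleus = {**base, "top_p": 0.9}

# The two requests differ only in which sampling parameter they set:
print(sorted(set(focused) ^ set(nucleus)))  # -> ['temperature', 'top_p']
```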
Setting to { 'type': 'json_schema', 'json_schema': {...} } enables Structured Outputs, which ensures the model's output will match your supplied JSON schema. Learn more in the Structured Outputs guide.
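As a sketch, using the weather schema from the example below: the response_format payload wraps the schema, and because Structured Outputs constrains the model to it, a conforming reply parses as plain JSON (the reply string here is illustrative):

```python
import json

# response_format wrapping the example weather schema from this section.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA",
            },
            "unit": {
                "type": "string",
                "description": "Unit for the output - one of (celsius, fahrenheit)",
            },
        },
        "required": ["location", "unit"],
    },
}

# A schema-conforming model reply can be parsed directly, with both
# required keys guaranteed to be present:
reply = '{"location": "San Francisco, CA", "unit": "celsius"}'
parsed = json.loads(reply)
print(parsed["unit"])  # -> celsius
```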
{
"json_schema": {
"properties": {
"location": {
"description": "The city and state, e.g. San Francisco, CA",
"type": "string"
},
"unit": {
"description": "Unit for the output - one of (celsius, fahrenheit)",
"type": "string"
}
},
"required": ["location", "unit"],
"type": "object"
},
"type": "json_schema"
}

Successful Response
A unique identifier for the chat completion.
The Unix time in seconds when the response was generated.
The model used for the chat completion.
The list of chat completion choices.
Usage statistics for the completion request.
The object type, which is always 'chat.completion'.
Available options: "chat.completion"
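Putting the response fields above together, a successful response can be consumed like this. The payload is a hand-built illustration shaped like the documented fields; the id, timestamp, and token counts are made up:

```python
import json

# A minimal successful response, shaped like the fields documented above.
# All concrete values (id, created, token counts) are illustrative.
raw = json.dumps({
    "id": "chatcmpl-123",
    "created": 1735689600,          # Unix time in seconds
    "model": "meta/llama-4-scout",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello!"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7},
    "object": "chat.completion",
})

resp = json.loads(raw)
print(resp["choices"][0]["message"]["content"])  # -> Hello!
print(resp["usage"]["total_tokens"])             # -> 7
```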