POST /openai/chat/completions

Create a chat completion
curl --request POST \
  --url https://api.sup.ai/v1/openai/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "messages": [
    {
      "content": "You are a helpful assistant.",
      "role": "system"
    },
    {
      "content": "Hello!",
      "role": "user"
    }
  ],
  "environment": {
    "date": "2024-01-15T10:30:00Z",
    "location": {
      "ip_address": "current"
    },
    "user_name": "John Doe"
  },
  "include_supai_chunks": false,
  "model": "auto",
  "models": null,
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}
'
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The capital of France is Paris.",
        "role": "assistant"
      }
    }
  ],
  "created": 1705312200,
  "id": "chatcmpl-abc123",
  "model": "auto",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 150,
    "prompt_tokens": 50,
    "total_tokens": 200
  }
}
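The curl example above can be sketched in Python using only the standard library. The payload mirrors the example request body on this page; the endpoint URL and bearer token handling are taken from the curl example, and the helper name is our own:

```python
import json
import urllib.request

# Request body mirroring the curl example above.
payload = {
    "messages": [
        {"content": "You are a helpful assistant.", "role": "system"},
        {"content": "Hello!", "role": "user"},
    ],
    "model": "auto",
    "stream": False,  # set True to receive Server-Sent Events instead of JSON
}

def create_chat_completion(api_key: str, body: dict) -> dict:
    """POST the body to the chat completions endpoint and decode the JSON reply."""
    req = urllib.request.Request(
        "https://api.sup.ai/v1/openai/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Serializing the payload shows the exact bytes that go on the wire.
wire_body = json.dumps(payload)
```

The non-streaming reply has the shape shown in the example response: pull the text from `response["choices"][0]["message"]["content"]`.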

Authorizations

Authorization
string
header
required

API key authentication. Use your Supai API key as the bearer token.

Body

application/json
messages
object[]
required

List of messages in the conversation. Must end with a user message.

Minimum array length: 1
Example:
[
  {
    "content": "You are a helpful assistant.",
    "role": "system"
  },
  {
    "content": "Hello!",
    "role": "user"
  }
]
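The two constraints stated above (at least one message, and the list must end with a user message) can be checked client-side before sending. This validator is a hypothetical helper, not part of the API:

```python
def validate_messages(messages: list[dict]) -> None:
    """Hypothetical client-side check mirroring the documented constraints:
    minimum array length 1, and the last message must have role "user"."""
    if not messages:
        raise ValueError("messages must contain at least one item")
    if messages[-1].get("role") != "user":
        raise ValueError("messages must end with a user message")

valid = [
    {"content": "You are a helpful assistant.", "role": "system"},
    {"content": "Hello!", "role": "user"},
]
validate_messages(valid)  # passes silently
```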
environment
object

User environment context including date, location, and name

include_supai_chunks
boolean
default:false

Whether to include Supai-specific chunk data in the stream

Example:

false

model
enum<string>
default:auto

The mode ID to use for generation. "auto" will automatically select the best mode.

Available options:
auto,
deep-thinking,
fast,
pro,
thinking
Example:

"auto"

models
enum<string>[] | null

Specific model IDs to use. If null, all non-deprecated models are available.

Available options:
alibaba/qwen-3-235b,
alibaba/qwen3-coder-30b-a3b,
alibaba/qwen3-max,
alibaba/qwen3-next-80b-a3b-thinking,
alibaba/qwen3-vl-thinking,
anthropic/claude-4.5-haiku,
anthropic/claude-opus-4.1,
anthropic/claude-opus-4.5,
anthropic/claude-sonnet-4.5,
deepseek/deepseek-v3.2,
deepseek/deepseek-v3.2-exp,
deepseek/deepseek-v3.2-exp-thinking,
deepseek/deepseek-v3.2-speciale,
deepseek/deepseek-v3.2-thinking,
google/gemini-2.5-flash,
google/gemini-2.5-flash-image,
google/gemini-2.5-flash-lite,
google/gemini-2.5-pro,
google/gemini-3-flash,
google/gemini-3-pro-image,
google/gemini-3-pro-preview,
meta/llama-3.3-70b,
meta/llama-4-maverick,
meta/llama-4-scout,
minimax/minimax-m2,
minimax/minimax-m2.1,
mistral/magistral-medium,
mistral/mistral-large,
mistral/mistral-medium,
mistral/mistral-small,
mistral/pixtral-12b,
moonshotai/kimi-k2-thinking-turbo,
moonshotai/kimi-k2-turbo,
openai/gpt-5,
openai/gpt-5-mini,
openai/gpt-5-nano,
openai/gpt-5-pro,
openai/gpt-5.1,
openai/gpt-5.1-instant,
openai/gpt-5.1-thinking,
openai/gpt-5.2,
openai/gpt-5.2-pro,
xai/grok-4,
xai/grok-4-fast-non-reasoning,
xai/grok-4-fast-reasoning,
xai/grok-4.1-fast-non-reasoning,
xai/grok-4.1-fast-reasoning,
zai/glm-4.5-air,
zai/glm-4.6,
zai/glm-4.6v,
zai/glm-4.7
Example:

null

stream
boolean
default:false

Whether to stream the response using Server-Sent Events

Example:

true
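When `stream` is true, the response arrives as Server-Sent Events. Assuming the stream follows the usual OpenAI-compatible format (`data: {...}` lines with per-chunk `delta` objects, terminated by `data: [DONE]`) — this page does not spell out the exact chunk shape — a minimal parser looks like this:

```python
import json

# Sample SSE payload in the OpenAI-compatible format (an assumption:
# the exact chunk shape is not documented on this page).
sample_stream = """\
data: {"choices": [{"delta": {"content": "The capital "}, "index": 0}]}

data: {"choices": [{"delta": {"content": "of France is Paris."}, "index": 0}]}

data: [DONE]
"""

def collect_content(sse_text: str) -> str:
    """Concatenate delta content from data: lines, stopping at [DONE]."""
    parts = []
    for line in sse_text.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```

With `stream_options.include_usage` set, the final chunk before `[DONE]` would also carry a `usage` object; handle it the same way by checking for a `usage` key on each decoded chunk.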

stream_options
object

Options for streaming responses

Response

Successful chat completion response. Returns JSON for non-streaming or SSE for streaming.

choices
object[]
required

List of generated responses

created
number
required

Unix timestamp of when the response was created

Example:

1705312200

id
string
required

Unique identifier for the chat completion

Example:

"chatcmpl-abc123"

model
string
required

The mode used for this completion

Example:

"auto"

object
enum<string>
required
Available options:
chat.completion
usage
object
required
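Per the example response above, the `usage` object reports `prompt_tokens`, `completion_tokens`, and `total_tokens`. A quick sanity check (our own helper, not part of the API) confirms the expected relationship between the three counts:

```python
# The usage object from the example response on this page.
usage = {"completion_tokens": 150, "prompt_tokens": 50, "total_tokens": 200}

def tokens_consistent(u: dict) -> bool:
    """total_tokens should equal prompt_tokens + completion_tokens."""
    return u["total_tokens"] == u["prompt_tokens"] + u["completion_tokens"]
```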