List Models
Retrieve information about all available AI models, including their capabilities, pricing, and limits.
Response
Returns a JSON object with text and image model categories:
Model Properties
- Full display name of the model.
- Abbreviated display name, if available.
- User-facing description of the model’s capabilities and best use cases.
- Maximum context window size in tokens. Example: 200000 for 200K tokens.
- Maximum number of tokens the model can generate in a single response.
- Maximum number of files that can be included in a single request.

List of features supported by this model:
| Feature | Description |
|---|---|
| tools | Function/tool calling support |
| thinking | Extended reasoning capabilities |
| image-input | Can process images in messages |
| pdf-input | Can process PDFs |
| image-output | Can generate images |
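A client can use this list to check capabilities before sending a request. The sketch below filters the text models down to those that advertise tool calling; the endpoint path, auth header, and field names (text, name, features) are assumptions for illustration, not confirmed API details:

```ts
// Sketch: list the text models that advertise tool calling.
// Endpoint path, auth header, and field names are assumptions, not confirmed API details.
interface ModelInfo {
  name: string;        // assumed field: full display name
  features?: string[]; // assumed field: e.g. ["tools", "thinking", "image-input"]
}

async function listToolCapableModels(apiKey: string): Promise<string[]> {
  const res = await fetch("https://api.example.com/v1/models", {
    headers: { Authorization: `Bearer ${apiKey}` }, // assumed auth scheme
  });
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);

  // Assumed response shape: { text: ModelInfo[], image: ModelInfo[] }
  const body = (await res.json()) as { text: ModelInfo[]; image: ModelInfo[] };
  return body.text
    .filter((model) => model.features?.includes("tools"))
    .map((model) => model.name);
}
```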
Deprecation information, if the model is deprecated.
Pricing in nano-dollars (1 billionth of a dollar) per million tokens:
| Field | Description |
|---|---|
| input | Cost per million input tokens |
| inputCacheRead | Cost per million cached input tokens (if supported) |
| inputCacheWrite | Cost per million tokens written to cache (if supported) |
| output | Cost per million output tokens |
| reasoning | Cost per million reasoning tokens (if separate billing) |
Pricing for tool calls:
| Field | Description |
|---|---|
| webSearch | Cost per million web search calls (if supported) |
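Because prices are quoted in nano-dollars per million tokens, turning a price into a dollar estimate is a small arithmetic step. The sketch below uses the input and output fields from the pricing table; the surrounding function and type names are illustrative:

```ts
// Convert nano-dollar-per-million-token prices into a dollar estimate.
// 1 dollar = 1e9 nano-dollars; prices are quoted per 1,000,000 tokens.
interface Pricing {
  input: number;  // nano-dollars per million input tokens
  output: number; // nano-dollars per million output tokens
}

function estimateCostUsd(
  pricing: Pricing,
  inputTokens: number,
  outputTokens: number
): number {
  const NANO_PER_DOLLAR = 1e9;
  const TOKENS_PER_UNIT = 1_000_000;
  return (
    (pricing.input / NANO_PER_DOLLAR) * (inputTokens / TOKENS_PER_UNIT) +
    (pricing.output / NANO_PER_DOLLAR) * (outputTokens / TOKENS_PER_UNIT)
  );
}

// Example: an input price of 3_000_000_000 nano-dollars is $3.00 per million input tokens.
```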
Example
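A minimal request sketch, assuming a /v1/models path and Bearer authentication (neither is confirmed here); it fetches the model list and prints the returned categories:

```ts
// Hypothetical request; the endpoint path and auth header are assumptions.
const res = await fetch("https://api.example.com/v1/models", {
  headers: { Authorization: "Bearer YOUR_API_KEY" },
});
const models = await res.json();

// The response contains text and image model categories.
console.log(Object.keys(models)); // expected: ["text", "image"]
```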
List Modes
Retrieve information about all available chat modes, including their model counts, thinking effort, and descriptions.
Response
Returns a JSON object mapping mode IDs to their configuration:
Mode Properties
- Display name of the mode.
- User-facing description of when to use this mode.
- Number of AI models used in parallel for this mode.
- Priority rank of the mode. Higher values indicate more powerful modes. The auto mode has rank -1 as it dynamically selects other modes.

The thinking effort level for this mode:
| Level | Description |
|---|---|
| minimal | Quick, direct responses with minimal deliberation |
| low | Light reasoning for straightforward tasks |
| medium | Balanced thinking for standard work |
| high | Deep reasoning for complex problems |
| extra-high | Maximum reasoning for mission-critical tasks |
Token budget allocated for extended thinking. Models with thinking capabilities use this budget to reason through problems before responding.
| Effort | Budget |
|---|---|
| minimal | 256 tokens |
| low | 1,024 tokens |
| medium | 2,048 tokens |
| high | 3,072 tokens |
| extra-high | 4,096 tokens |
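If a client wants to surface these budgets, for example in a settings UI, they can be kept as a simple lookup; the values below just mirror the table above:

```ts
// Thinking-effort levels and their token budgets, mirroring the table above.
type ThinkingEffort = "minimal" | "low" | "medium" | "high" | "extra-high";

const THINKING_BUDGET: Record<ThinkingEffort, number> = {
  minimal: 256,
  low: 1_024,
  medium: 2_048,
  high: 3_072,
  "extra-high": 4_096,
};
```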
Example
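As with models, a hedged request sketch, assuming a /v1/modes path and Bearer authentication; the response maps mode IDs to their configuration:

```ts
// Hypothetical request; the endpoint path and auth header are assumptions.
const res = await fetch("https://api.example.com/v1/modes", {
  headers: { Authorization: "Bearer YOUR_API_KEY" },
});
const modes = await res.json();

// The response maps mode IDs to their configuration.
console.log(Object.keys(modes)); // e.g. ["auto", "fast", "thinking", "deep-thinking", "pro", "image"]
```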
Mode Selection Guide
auto
Best for: Most use cases
The auto mode intelligently analyzes your prompt and selects the optimal mode. It starts with minimal resources and escalates when complexity is detected. This is the recommended default for most applications.
fast
Best for: Trivial tasks, typo fixes, simple formatting
Single model with minimal thinking. Use when speed is critical and the task is straightforward. Examples: spell checking, simple code formatting, basic calculations.
thinking
Best for: Standard development work
Three models with medium thinking effort. The default for most coding tasks, debugging, feature implementation, and analysis. Balances quality with response time.
deep-thinking
Best for: Complex multi-faceted problems
Six models with high thinking effort. Use for architectural decisions, system design, complex debugging, and problems requiring parallel exploration of multiple approaches.
pro
Best for: High-stakes, mission-critical work
Nine models with extra-high thinking effort. Reserved for legal, financial, medical, or regulatory work where maximum accuracy and verification are essential.
image
Best for: Visual content creation
Specialized mode for generating and editing images. Optimized for interpreting visual descriptions and producing high-quality imagery.
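One way a client might encode this guide is a small helper that maps a rough task category to a mode ID. The task categories below are assumptions for illustration; the mode IDs and their intended uses come from the guide above:

```ts
// Illustrative helper mapping rough task categories to the mode IDs above.
// The task categories themselves are assumptions, not part of the API.
type TaskKind =
  | "trivial"
  | "coding"
  | "architecture"
  | "high-stakes"
  | "image"
  | "unknown";

function pickMode(task: TaskKind): string {
  switch (task) {
    case "trivial":
      return "fast"; // typo fixes, simple formatting
    case "coding":
      return "thinking"; // standard development work
    case "architecture":
      return "deep-thinking"; // complex multi-faceted problems
    case "high-stakes":
      return "pro"; // legal, financial, medical, regulatory work
    case "image":
      return "image"; // visual content creation
    default:
      return "auto"; // recommended default
  }
}
```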
Error Responses
| Status | Description |
|---|---|
| 401 | Unauthorized - missing or invalid API key |
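A brief sketch of handling the 401 case on the client, reusing the same illustrative endpoint and header as above:

```ts
// Minimal 401 handling sketch; endpoint path and auth header are assumptions.
const res = await fetch("https://api.example.com/v1/models", {
  headers: { Authorization: "Bearer YOUR_API_KEY" },
});

if (res.status === 401) {
  throw new Error("Unauthorized: missing or invalid API key");
}
const models = await res.json();
```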