List Models
Retrieve information about all available AI models, including their capabilities, pricing, and limits.
Response
Returns a JSON object with text and image model categories:
Model Properties
- Full display name of the model.
- Abbreviated display name, if available.
- User-facing description of the model’s capabilities and best use cases.
- Maximum context window size in tokens. Example: 200000 for 200K tokens.
- Maximum number of tokens the model can generate in a single response.
- Maximum number of files that can be included in a single request.

List of features supported by this model:
| Feature | Description |
|---|---|
| tools | Function/tool calling support |
| thinking | Extended reasoning capabilities |
| image-input | Can process images in messages |
| pdf-input | Can process PDFs |
| image-output | Can generate images |
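A client can use this list to check capabilities before sending a request. The sketch below filters the text models down to those that advertise tool calling; the endpoint path, auth header, and field names (text, name, features) are assumptions for illustration, not confirmed API details:

```ts
// Sketch: list the text models that advertise tool calling.
// Endpoint path, auth header, and field names are assumptions, not confirmed API details.
interface ModelInfo {
  name: string;        // assumed field: full display name
  features?: string[]; // assumed field: e.g. ["tools", "thinking", "image-input"]
}

async function listToolCapableModels(apiKey: string): Promise<string[]> {
  const res = await fetch("https://api.example.com/v1/models", {
    headers: { Authorization: `Bearer ${apiKey}` }, // assumed auth scheme
  });
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);

  // Assumed response shape: { text: ModelInfo[], image: ModelInfo[] }
  const body = (await res.json()) as { text: ModelInfo[]; image: ModelInfo[] };
  return body.text
    .filter((model) => model.features?.includes("tools"))
    .map((model) => model.name);
}
```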
Deprecation information, if the model is deprecated.
Pricing in nano-dollars (1 billionth of a dollar) per million tokens:
| Field | Description |
|---|---|
| input | Cost per million input tokens |
| inputCacheRead | Cost per million cached input tokens (if supported) |
| inputCacheWrite | Cost per million tokens written to cache (if supported) |
| output | Cost per million output tokens |
| reasoning | Cost per million reasoning tokens (if separate billing) |
Pricing for tool calls:
| Field | Description |
|---|---|
| webSearch | Cost per million web search calls (if supported) |
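Because prices are quoted in nano-dollars per million tokens, turning a price into a dollar estimate is a small arithmetic step. The sketch below uses the input and output fields from the pricing table; the surrounding function and type names are illustrative:

```ts
// Convert nano-dollar-per-million-token prices into a dollar estimate.
// 1 dollar = 1e9 nano-dollars; prices are quoted per 1,000,000 tokens.
interface Pricing {
  input: number;  // nano-dollars per million input tokens
  output: number; // nano-dollars per million output tokens
}

function estimateCostUsd(
  pricing: Pricing,
  inputTokens: number,
  outputTokens: number
): number {
  const NANO_PER_DOLLAR = 1e9;
  const TOKENS_PER_UNIT = 1_000_000;
  return (
    (pricing.input / NANO_PER_DOLLAR) * (inputTokens / TOKENS_PER_UNIT) +
    (pricing.output / NANO_PER_DOLLAR) * (outputTokens / TOKENS_PER_UNIT)
  );
}

// Example: an input price of 3_000_000_000 nano-dollars is $3.00 per million input tokens.
```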
Example
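A minimal request sketch, assuming a /v1/models path and Bearer authentication (neither is confirmed here); it fetches the model list and prints the returned categories:

```ts
// Hypothetical request; the endpoint path and auth header are assumptions.
const res = await fetch("https://api.example.com/v1/models", {
  headers: { Authorization: "Bearer YOUR_API_KEY" },
});
const models = await res.json();

// The response contains text and image model categories.
console.log(Object.keys(models)); // expected: ["text", "image"]
```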
List Modes
Retrieve information about all available chat modes, including their model counts, thinking effort, and descriptions.
Response
Returns a JSON object mapping mode IDs to their configuration:
Mode Properties
- Display name of the mode.
- User-facing description of when to use this mode.
- Number of AI models used in parallel for this mode.
- Priority rank of the mode. Higher values indicate more powerful modes. The auto mode has rank -1 as it dynamically selects other modes.

The thinking effort level for this mode:
| Level | Description |
|---|---|
| minimal | Quick, direct responses with minimal deliberation |
| low | Light reasoning for straightforward tasks |
| medium | Balanced thinking for standard work |
| high | Deep reasoning for complex problems |
| extra-high | Maximum reasoning for mission-critical tasks |
Token budget allocated for extended thinking. Models with thinking capabilities use this budget to reason through problems before responding.
| Effort | Budget |
|---|---|
| minimal | 256 tokens |
| low | 1,024 tokens |
| medium | 2,048 tokens |
| high | 3,072 tokens |
| extra-high | 4,096 tokens |
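If a client wants to surface these budgets, for example in a settings UI, they can be kept as a simple lookup; the values below just mirror the table above:

```ts
// Thinking-effort levels and their token budgets, mirroring the table above.
type ThinkingEffort = "minimal" | "low" | "medium" | "high" | "extra-high";

const THINKING_BUDGET: Record<ThinkingEffort, number> = {
  minimal: 256,
  low: 1_024,
  medium: 2_048,
  high: 3_072,
  "extra-high": 4_096,
};
```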
Example
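As with models, a hedged request sketch, assuming a /v1/modes path and Bearer authentication; the response maps mode IDs to their configuration:

```ts
// Hypothetical request; the endpoint path and auth header are assumptions.
const res = await fetch("https://api.example.com/v1/modes", {
  headers: { Authorization: "Bearer YOUR_API_KEY" },
});
const modes = await res.json();

// The response maps mode IDs to their configuration.
console.log(Object.keys(modes)); // e.g. ["auto", "fast", "thinking", "deep-thinking", "pro", "image"]
```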
Mode Selection Guide
auto
Best for: Most use cases
The auto mode intelligently analyzes your prompt and selects the optimal mode. It starts with minimal resources and escalates when complexity is detected. This is the recommended default for most applications.
fast
Best for: Trivial tasks, typo fixes, simple formatting
Single model with minimal thinking. Use when speed is critical and the task is straightforward. Examples: spell checking, simple code formatting, basic calculations.
thinking
Best for: Standard development work
Three models with medium thinking effort. The default for most coding tasks, debugging, feature implementation, and analysis. Balances quality with response time.
deep-thinking
Best for: Complex multi-faceted problems
Six models with high thinking effort. Use for architectural decisions, system design, complex debugging, and problems requiring parallel exploration of multiple approaches.
pro
Best for: High-stakes, mission-critical work
Nine models with extra-high thinking effort. Reserved for legal, financial, medical, or regulatory work where maximum accuracy and verification are essential.
image
Best for: Visual content creation
Specialized mode for generating and editing images. Optimized for interpreting visual descriptions and producing high-quality imagery.
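One way a client might encode this guide is a small helper that maps a rough task category to a mode ID. The task categories below are assumptions for illustration; the mode IDs and their intended uses come from the guide above:

```ts
// Illustrative helper mapping rough task categories to the mode IDs above.
// The task categories themselves are assumptions, not part of the API.
type TaskKind =
  | "trivial"
  | "coding"
  | "architecture"
  | "high-stakes"
  | "image"
  | "unknown";

function pickMode(task: TaskKind): string {
  switch (task) {
    case "trivial":
      return "fast"; // typo fixes, simple formatting
    case "coding":
      return "thinking"; // standard development work
    case "architecture":
      return "deep-thinking"; // complex multi-faceted problems
    case "high-stakes":
      return "pro"; // legal, financial, medical, regulatory work
    case "image":
      return "image"; // visual content creation
    default:
      return "auto"; // recommended default
  }
}
```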
Error Responses
| Status | Description |
|---|---|
| 401 | Unauthorized - missing or invalid API key |
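A brief sketch of handling the 401 case on the client, reusing the same illustrative endpoint and header as above:

```ts
// Minimal 401 handling sketch; endpoint path and auth header are assumptions.
const res = await fetch("https://api.example.com/v1/models", {
  headers: { Authorization: "Bearer YOUR_API_KEY" },
});

if (res.status === 401) {
  throw new Error("Unauthorized: missing or invalid API key");
}
const models = await res.json();
```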