Quickstart

Knox Chat provides a unified API that gives you access to hundreds of AI models through a single endpoint, while automatically handling fallbacks and selecting the most cost-effective options. Our goal is not merely to offer a single API for accessing multiple models, but also to focus on multimodality and make it convenient to use today's popular open-source AI and agent applications and tools with just one key.

Using the OpenAI SDK

import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'https://knox.chat/v1',
  apiKey: '<KNOXCHAT_API_KEY>',
});

async function main() {
  const completion = await openai.chat.completions.create({
    model: 'openai/gpt-5',
    messages: [
      {
        role: 'user',
        content: 'What is the meaning of life?',
      },
    ],
  });

  console.log(completion.choices[0].message);
}

main();

Using the Knox.Chat API directly

import requests
import json

response = requests.post(
    url="https://knox.chat/v1/chat/completions",
    headers={
        "Authorization": "Bearer <KNOXCHAT_API_KEY>",
    },
    data=json.dumps({
        "model": "anthropic/claude-sonnet-4",  # Optional
        "messages": [
            {
                "role": "user",
                "content": "What is the meaning of life?"
            }
        ]
    })
)

The API also supports streaming.
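
As a minimal sketch, the same requests-based call can consume a stream by setting "stream": true and reading server-sent events; this assumes Knox Chat follows the OpenAI-style SSE format (data:-prefixed JSON chunks, terminated by a data: [DONE] sentinel):

import requests
import json

response = requests.post(
    url="https://knox.chat/v1/chat/completions",
    headers={"Authorization": "Bearer <KNOXCHAT_API_KEY>"},
    json={
        "model": "anthropic/claude-sonnet-4",
        "messages": [{"role": "user", "content": "What is the meaning of life?"}],
        "stream": True,  # Ask the server to stream server-sent events
    },
    stream=True,  # Keep the HTTP connection open and iterate as chunks arrive
)

for line in response.iter_lines():
    if not line:
        continue
    decoded = line.decode("utf-8")
    if decoded.startswith("data: "):
        payload = decoded[len("data: "):]
        if payload == "[DONE]":  # Sentinel that ends an OpenAI-style stream
            break
        chunk = json.loads(payload)
        # Each chunk carries an incremental delta with newly generated text
        print(chunk["choices"][0]["delta"].get("content") or "", end="", flush=True)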

Using third-party SDKs

For information about using third-party SDKs and frameworks with Knox Chat, please see our frameworks documentation.

Principles

Knox Chat helps developers source and optimize AI usage. We believe the future is multimodal and multi-provider.

Why Knox Chat?

Price and Performance. Knox Chat scouts for the best prices, the lowest latencies, and the highest throughput across dozens of providers, and lets you choose how to prioritize them.

Standardized API. No need to change code when switching between models or providers. You can even let your users choose and pay for their own.

Real-World Insights. Be the first to take advantage of new models. See real-world data of how often models are used for different purposes. Keep up to date in our Discord channel.

Consolidated Billing. Simple and transparent billing, regardless of how many providers you use.

Higher Availability. Fallback providers, and automatic, smart routing means your requests still work even when providers go down.

Higher Rate Limits. Knox Chat works directly with providers to provide better rate limits and more throughput.

Models

One API for hundreds of models

Explore and browse 300+ models and providers on our website, or with our API.
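
For example, a minimal sketch of listing models programmatically, assuming the Models API is served at the OpenAI-style /v1/models path (its response schema is documented below):

import requests

response = requests.get(
    url="https://knox.chat/v1/models",
    headers={"Authorization": "Bearer <KNOXCHAT_API_KEY>"},
)

# The response wraps an array of Model objects in a "data" field
for model in response.json()["data"]:
    print(model["id"], model["context_length"])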

Models API Standard

Our Models API makes the most important information about all LLMs freely available as soon as we confirm it.

API Response Schema

The Models API returns a standardized JSON response format that provides comprehensive metadata for each available model. This schema is cached at the edge and designed for reliable integration in production applications.

Root Response Object

{
  "data": [
    /* Array of Model objects */
  ]
}

Model Object Schema

Each model in the data array contains the following standardized fields:

| Field | Type | Description |
|---|---|---|
| id | string | Unique model identifier used in API requests (e.g., "google/gemini-2.5-pro") |
| object | string | Object type identifier (always "model") |
| created | number | Unix timestamp of when the model was created |
| owned_by | string | Organization that owns the model |
| permission | ModelPermission[] | Array of permission objects defining access controls |
| root | string | Root model identifier |
| parent | string \| null | Parent model identifier if this is a fine-tuned version |
| context_length | number | Maximum context window size in tokens |
| architecture | Architecture | Object describing the model's technical capabilities |
| pricing | Pricing | Price structure for using this model |
| top_provider | TopProvider | Configuration details for the primary provider |
| supported_parameters | string[] | Array of supported API parameters for this model |

Architecture Object

{
  "modality": string,            // High-level description of input/output flow (e.g., "text+image->text")
  "input_modalities": string[],  // Supported input types: ["file", "image", "text", "audio"]
  "output_modalities": string[], // Supported output types: ["text"]
  "tokenizer": string            // Tokenization method used (e.g., "Gemini")
}

Pricing Object

All pricing values are in USD per token/request/unit. A value of "0" indicates the feature is free.

{
  "prompt": string,             // Cost per input token
  "completion": string,         // Cost per output token
  "request": string,            // Fixed cost per API request
  "image": string,              // Cost per image input
  "audio": string,              // Cost per audio input
  "web_search": string,         // Cost per web search operation
  "internal_reasoning": string, // Cost for internal reasoning tokens
  "input_cache_read": string,   // Cost per cached input token read
  "input_cache_write": string   // Cost per cached input token write
}

Top Provider Object

{
  "context_length": number,         // Provider-specific context limit
  "max_completion_tokens": number   // Maximum tokens in response
}
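
Putting these objects together, here is a hedged sketch that filters the model list using the fields documented above; the /v1/models path is an assumption carried over from the earlier example:

import requests

models = requests.get(
    "https://knox.chat/v1/models",
    headers={"Authorization": "Bearer <KNOXCHAT_API_KEY>"},
).json()["data"]

# Find image-capable models with a large context window
for model in models:
    arch = model["architecture"]
    if "image" in arch["input_modalities"] and model["context_length"] >= 200_000:
        # Pricing fields are strings in USD per token; scale to USD per 1M tokens
        prompt_per_million = float(model["pricing"]["prompt"]) * 1_000_000
        print(f"{model['id']}: ${prompt_per_million:.2f} per 1M input tokens")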

Supported Parameters

The supported_parameters array indicates which OpenAI-compatible parameters work with each model (a usage sketch follows the list):

  • include_reasoning - Include reasoning in response
  • max_tokens - Response length limiting
  • reasoning - Internal reasoning mode
  • response_format - Output format specification
  • seed - Deterministic outputs
  • stop - Custom stop sequences
  • structured_outputs - JSON schema enforcement
  • temperature - Randomness control
  • tool_choice - Tool selection control
  • tools - Function calling capabilities
  • top_p - Nucleus sampling
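
As an illustration, a request can be guarded on this array before sending a parameter the model may not accept; the model id below is only illustrative, and the /v1/models path is assumed as before:

import requests

headers = {"Authorization": "Bearer <KNOXCHAT_API_KEY>"}
models = requests.get("https://knox.chat/v1/models", headers=headers).json()["data"]
by_id = {m["id"]: m for m in models}

model_id = "openai/gpt-5"  # Illustrative; use any id returned by the Models API
body = {
    "model": model_id,
    "messages": [{"role": "user", "content": "Reply with a JSON object."}],
}

# Only attach response_format when the model advertises support for it
if "response_format" in by_id[model_id]["supported_parameters"]:
    body["response_format"] = {"type": "json_object"}

response = requests.post(
    "https://knox.chat/v1/chat/completions", headers=headers, json=body
)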

Token Counting Differences

Different models use different tokenization methods (as indicated by the tokenizer field in the model schema). Some models break text into multi-character chunks (GPT, Claude, Llama, etc.), while others tokenize differently (like Gemini). This means that token counts, and therefore costs, will vary between models even when inputs and outputs are identical. Costs are displayed and billed according to the tokenizer of the model in use. You can use the usage field in API responses to get the actual token counts for your input and output.
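
For example, a short sketch of reading those counts, assuming the OpenAI-style usage object on a non-streaming completion:

import requests

response = requests.post(
    "https://knox.chat/v1/chat/completions",
    headers={"Authorization": "Bearer <KNOXCHAT_API_KEY>"},
    json={
        "model": "anthropic/claude-sonnet-4",
        "messages": [{"role": "user", "content": "What is the meaning of life?"}],
    },
)

usage = response.json()["usage"]  # Counted with the serving model's own tokenizer
print("input tokens:", usage["prompt_tokens"])
print("output tokens:", usage["completion_tokens"])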

Frequently Asked Questions

Privacy and Data Logging

Please see our Terms of Service and Privacy Policy.