API Reference

Knox Chat's request and response schemas are very similar to the OpenAI Chat API, with only a few small differences. Crucially, Knox Chat normalizes the schema across models and providers, so you only need to learn one interface.

Base URL

All API requests should be sent to the following base URL:

https://api.knox.chat/v1

Authentication

Every API request must include your API key in the request headers:

Authorization: Bearer sk-...

You can find or create API keys on the Knox Chat Token page.
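As a minimal sketch, attaching that header in code could look like this (the helper name is illustrative, and "sk-test" below is a placeholder, not a real key):

```typescript
// Sketch: build the headers every Knox Chat request needs.
function buildHeaders(apiKey: string): Record<string, string> {
  return {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
}
```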

Main endpoints

The Knox Chat API includes the following main endpoints:

Text generation

Model information

Parameters

Knox Chat supports a variety of parameters for controlling model behavior and output. See the Parameters section for details.

Error handling

The API uses standard HTTP status codes to indicate the result of a request:

  • 200 - Request succeeded
  • 400 - Invalid or missing request parameters
  • 401 - Authentication failed (invalid or expired API key)
  • 402 - Insufficient account balance
  • 404 - Requested resource not found
  • 429 - Rate limit exceeded
  • 500 - Internal server error
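A minimal sketch of client-side handling for these codes (the helper name is illustrative, not part of the API):

```typescript
// Hypothetical helper: map the status codes listed above to short descriptions.
function describeStatus(status: number): string {
  const meanings: Record<number, string> = {
    200: "Request succeeded",
    400: "Invalid or missing request parameters",
    401: "Authentication failed (invalid or expired API key)",
    402: "Insufficient account balance",
    404: "Requested resource not found",
    429: "Rate limit exceeded",
    500: "Internal server error",
  };
  return meanings[status] ?? `Unexpected status: ${status}`;
}
```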

Requests

Completion request format

Below is the TypeScript type definition for the request schema. It is used as the request body when you send a POST request to the /v1/chat/completions endpoint (see the Quickstart example above).

For the complete list of parameters, see the Parameters section.

// Definitions of subtypes are below
type Request = {
  // Either "messages" or "prompt" is required
  messages?: Message[];
  prompt?: string;

  // If "model" is unspecified, uses the user's default
  model?: string; // See "Supported Models" section

  // Forces the model to produce a specific output format.
  // See the models page and the note on this docs page for which models support it.
  response_format?: { type: 'json_object' };

  stop?: string | string[];
  stream?: boolean; // Enable streaming

  // See LLM Parameters (docs.knox.chat/api-reference/parameters)
  max_tokens?: number; // Range: [1, context_length)
  temperature?: number; // Range: [0, 2]

  // Tool calling
  // Passed down as-is for providers implementing OpenAI's interface.
  // For providers with custom interfaces, we transform and map the properties.
  // Otherwise, we transform the tools into a YAML template. The model responds with an assistant message.
  tools?: Tool[];
  tool_choice?: ToolChoice;

  // Advanced optional parameters
  seed?: number; // Integer only
  top_p?: number; // Range: (0, 1]
  top_k?: number; // Range: [1, Infinity) Not available for OpenAI models
  frequency_penalty?: number; // Range: [-2, 2]
  presence_penalty?: number; // Range: [-2, 2]
  repetition_penalty?: number; // Range: (0, 2]
  logit_bias?: { [key: number]: number };
  top_logprobs?: number; // Integer only
  min_p?: number; // Range: [0, 1]
  top_a?: number; // Range: [0, 1]

  // Reduce latency by providing the model with a predicted output
  // https://platform.openai.com/docs/guides/latency-optimization#use-predicted-outputs
  prediction?: { type: 'content'; content: string };

  // Knox Chat-only parameters
  // See "Prompt Transforms" section: docs.knox.chat/message-transforms
  transforms?: string[];
  // See "Model Routing" section: docs.knox.chat/model-routing
  models?: string[];
  route?: 'fallback';
  // See "Provider Routing" section: docs.knox.chat/provider-routing
  provider?: ProviderPreferences;
};

// Subtypes:

type TextContent = {
  type: 'text';
  text: string;
};

type ImageContentPart = {
  type: 'image_url';
  image_url: {
    url: string; // URL or base64 encoded image data
    detail?: string; // Optional, defaults to "auto"
  };
};

type ContentPart = TextContent | ImageContentPart;

type Message =
  | {
      role: 'user' | 'assistant' | 'system';
      // ContentParts are only for the "user" role:
      content: string | ContentPart[];
      // If "name" is included, it will be prepended like this
      // for non-OpenAI models: `{name}: {content}`
      name?: string;
    }
  | {
      role: 'tool';
      content: string;
      tool_call_id: string;
      name?: string;
    };

type FunctionDescription = {
  description?: string;
  name: string;
  parameters: object; // JSON Schema object
};

type Tool = {
  type: 'function';
  function: FunctionDescription;
};

type ToolChoice =
  | 'none'
  | 'auto'
  | {
      type: 'function';
      function: {
        name: string;
      };
    };

The response_format parameter ensures that you receive structured responses from the LLM. It is only supported by OpenAI models, Nitro models, and some other models.
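For example, a request body constraining the model to emit JSON might look like this (a sketch; the model name is illustrative and must be one that supports response_format):

```typescript
// Sketch: a request body asking for JSON-only output.
const body = {
  model: "openai/gpt-5.2", // illustrative; must be a model that supports response_format
  messages: [{ role: "user", content: "List three colors as a JSON object." }],
  response_format: { type: "json_object" as const },
};
```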

Non-standard parameters

If the chosen model doesn't support a request parameter (for example, logit_bias on non-OpenAI models, or top_k for OpenAI), the parameter is ignored. The remaining parameters are forwarded to the underlying model API.

Assistant prefill

Knox Chat supports asking models to continue a partial response. This can be useful for guiding a model to answer in a particular way.

To use this feature, simply include a message with role: "assistant" at the end of your messages array.

TypeScript
fetch('https://api.knox.chat/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer <KNOXCHAT_API_KEY>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/gpt-5.2',
    messages: [
      { role: 'user', content: 'What is the meaning of life?' },
      { role: 'assistant', content: "I'm not sure, but my best guess is" },
    ],
  }),
});

Images and multimodal

Multimodal requests are only available via the /v1/chat/completions API with a multi-part messages parameter. The image_url can be either a URL or base64-encoded image data.

"messages": [
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "What's in this image?"
      },
      {
        "type": "image_url",
        "image_url": {
          "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
        }
      }
    ]
  }
]

Sample LLM response:

{
  "choices": [
    {
      "role": "assistant",
      "content": "This image depicts a scenic natural landscape featuring a long wooden boardwalk that stretches out through an expansive field of green grass. The boardwalk provides a clear path and invites exploration through the lush environment. The scene is surrounded by a variety of shrubbery and trees in the background, indicating a diverse plant life in the area."
    }
  ]
}

Image generation

Some models support native image generation. To generate images, include modalities: ["image", "text"] in your request. The model will return images in the OpenAI ContentPartImage format, where image_url contains a base64 data URL.

{
  "model": "openai/dall-e-3",
  "messages": [
    {
      "role": "user",
      "content": "Create a beautiful sunset over mountains"
    }
  ],
  "modalities": ["image", "text"]
}

Sample image generation response:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": [
          {
            "type": "text",
            "text": "Here's your requested sunset over mountains."
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/png;base64,..."
            }
          }
        ]
      }
    }
  ]
}

Uploading base64-encoded images

For locally stored images, you can send them to the model via base64 encoding. Here's an example:

import { readFile } from "fs/promises";

const getFlowerImage = async (): Promise<string> => {
  const imagePath = new URL("flower.jpg", import.meta.url);
  const imageBuffer = await readFile(imagePath);
  const base64Image = imageBuffer.toString("base64");
  return `data:image/jpeg;base64,${base64Image}`;
};

...

"messages": [
  {
    role: "user",
    content: [
      {
        type: "text",
        text: "What's in this image?",
      },
      {
        type: "image_url",
        image_url: {
          url: `${await getFlowerImage()}`,
        },
      },
    ],
  },
];

When sending base64-encoded data strings, make sure to include the image's content type. Example:

data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII

Supported image types:

  • image/png
  • image/jpeg
  • image/webp
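A small sketch (the helper name is hypothetical) for building the correct data-URL prefix from a file extension, restricted to the supported types above:

```typescript
// Hypothetical helper: map a file extension to the data-URL prefix
// for one of the supported image types.
function dataUrlPrefix(ext: string): string {
  const mimeTypes: Record<string, string> = {
    png: "image/png",
    jpg: "image/jpeg",
    jpeg: "image/jpeg",
    webp: "image/webp",
  };
  const mime = mimeTypes[ext.toLowerCase()];
  if (!mime) throw new Error(`Unsupported image type: ${ext}`);
  return `data:${mime};base64,`;
}
```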

Responses

CompletionsResponse format

Knox Chat normalizes the schema across models and providers to comply with the OpenAI Chat API specification.

This means that choices is always an array, even if the model only returns one completion. Each choice will contain a delta property if a stream was requested, and a message property otherwise. This makes it easier to use the same code for all models.

Here's the TypeScript type definition for the response schema:

// Definitions of subtypes are below
type Response = {
  id: string;
  // Depending on whether you set "stream" to "true" and
  // whether you passed in "messages" or a "prompt", you
  // will get a different output shape
  choices: (NonStreamingChoice | StreamingChoice | NonChatChoice)[];
  created: number; // Unix timestamp
  model: string;
  object: 'chat.completion' | 'chat.completion.chunk';

  system_fingerprint?: string; // Only present if the provider supports it

  // Usage data is always returned for non-streaming.
  // When streaming, you will get one usage object at
  // the end accompanied by an empty choices array.
  usage?: ResponseUsage;
};

// If the provider returns usage, we pass it down
// as-is. Otherwise, we count using the GPT-4 tokenizer.
type ResponseUsage = {
  /** Including images and tools if any */
  prompt_tokens: number;
  /** The tokens generated */
  completion_tokens: number;
  /** Sum of the above two fields */
  total_tokens: number;
};
// Subtypes:
type NonChatChoice = {
  finish_reason: string | null;
  text: string;
  error?: ErrorResponse;
};

type NonStreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  message: {
    content: string | null;
    role: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type StreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  delta: {
    content: string | null;
    role?: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type ErrorResponse = {
  code: number; // See "Error Handling" section
  message: string;
  metadata?: Record<string, unknown>; // Contains additional error information such as provider details, the raw error message, etc.
};

type ToolCall = {
  id: string;
  type: 'function';
  function: FunctionCall;
};

Here's an example:

{
  "id": "gen-xxxxxxxxxxxxxx",
  "choices": [
    {
      "finish_reason": "stop", // Normalized finish_reason
      "native_finish_reason": "stop", // The raw finish_reason from the provider
      "message": {
        // will be "delta" if streaming
        "role": "assistant",
        "content": "Hello there!"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 4,
    "total_tokens": 4
  },
  "model": "anthropic/claude-sonnet-4.6" // Could also be "anthropic/claude-2.1", etc, depending on the "model" that ends up being used
}
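Because streaming chunks carry delta while full responses carry message, a small normalizing helper (hypothetical, modeled on the choice types above) lets the same code consume both:

```typescript
// Sketch: extract text from either a streaming or a non-streaming choice.
type AnyChoice = {
  delta?: { content: string | null };
  message?: { content: string | null };
};

function choiceText(choice: AnyChoice): string {
  // Streaming chunks populate "delta"; non-streaming responses populate "message".
  return choice.delta?.content ?? choice.message?.content ?? "";
}
```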

Finish reasons

Knox Chat normalizes each model's finish_reason to one of the following values: tool_calls, stop, length, content_filter, error.

Some models and providers may have additional finish reasons. The raw finish_reason string returned by the model is available via the native_finish_reason property.

Querying cost and stats

The token counts returned in the completions API response are not computed with the model's native tokenizer. Instead, they use a normalized, model-agnostic count (implemented via the GPT-5.2 tokenizer), because some providers can't reliably return native token counts. However, this is becoming less common, and we may add native token counts to the response object in the future.

Credit usage and model pricing are based on native token counts (not the "normalized" token counts returned in the API response).