API Reference

Knox Chat's request and response schemas are very similar to the OpenAI Chat API, with only a few small differences. Crucially, Knox Chat normalizes the schema across models and providers, so you only need to learn one interface.

Base URL

All API requests should be sent to the following base URL:

https://api.knox.chat/v1

Authentication

Every API request must include your API key in the request headers:

Authorization: Bearer sk-...

You can find or create API keys on the Knox Chat Token page.
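As a minimal sketch, attaching that header in code could look like this (the helper name is illustrative, and "sk-test" below is a placeholder, not a real key):

```typescript
// Sketch: build the headers every Knox Chat request needs.
function buildHeaders(apiKey: string): Record<string, string> {
  return {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
}
```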

Main endpoints

The Knox Chat API includes the following main endpoints:

Text generation

Model information

Parameters

Knox Chat supports a variety of parameters for controlling model behavior and output. See the Parameters section for details.

Error handling

The API uses standard HTTP status codes to indicate the result of a request:

  • 200 - Request succeeded
  • 400 - Invalid or missing request parameters
  • 401 - Authentication failed (invalid or expired API key)
  • 402 - Insufficient account balance
  • 404 - Requested resource not found
  • 429 - Rate limit exceeded
  • 500 - Internal server error
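A minimal sketch of client-side handling for these codes (the helper name is illustrative, not part of the API):

```typescript
// Hypothetical helper: map the status codes listed above to short descriptions.
function describeStatus(status: number): string {
  const meanings: Record<number, string> = {
    200: "Request succeeded",
    400: "Invalid or missing request parameters",
    401: "Authentication failed (invalid or expired API key)",
    402: "Insufficient account balance",
    404: "Requested resource not found",
    429: "Rate limit exceeded",
    500: "Internal server error",
  };
  return meanings[status] ?? `Unexpected status: ${status}`;
}
```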

Requests

Completion request format

Below is the TypeScript type definition for the request schema. It is used as the request body when you send a POST request to the /v1/chat/completions endpoint (see the Quickstart example above).

For the complete list of parameters, see the Parameters section.

// Definitions of subtypes are below
type Request = {
  // Either "messages" or "prompt" is required
  messages?: Message[];
  prompt?: string;

  // If "model" is unspecified, uses the user's default
  model?: string; // See "Supported Models" section

  // Forces the model to produce a specific output format.
  // See the models page and the note on this docs page for which models support it.
  response_format?: { type: 'json_object' };

  stop?: string | string[];
  stream?: boolean; // Enable streaming

  // See LLM Parameters (docs.knox.chat/api-reference/parameters)
  max_tokens?: number; // Range: [1, context_length)
  temperature?: number; // Range: [0, 2]

  // Tool calling
  // Passed down as-is for providers implementing OpenAI's interface.
  // For providers with custom interfaces, we transform and map the properties.
  // Otherwise, we transform the tools into a YAML template. The model responds with an assistant message.
  tools?: Tool[];
  tool_choice?: ToolChoice;

  // Advanced optional parameters
  seed?: number; // Integer only
  top_p?: number; // Range: (0, 1]
  top_k?: number; // Range: [1, Infinity) Not available for OpenAI models
  frequency_penalty?: number; // Range: [-2, 2]
  presence_penalty?: number; // Range: [-2, 2]
  repetition_penalty?: number; // Range: (0, 2]
  logit_bias?: { [key: number]: number };
  top_logprobs?: number; // Integer only
  min_p?: number; // Range: [0, 1]
  top_a?: number; // Range: [0, 1]

  // Reduce latency by providing the model with a predicted output
  // https://platform.openai.com/docs/guides/latency-optimization#use-predicted-outputs
  prediction?: { type: 'content'; content: string };

  // Knox Chat-only parameters
  // See "Prompt Transforms" section: docs.knox.chat/message-transforms
  transforms?: string[];
  // See "Model Routing" section: docs.knox.chat/model-routing
  models?: string[];
  route?: 'fallback';
  // See "Provider Routing" section: docs.knox.chat/provider-routing
  provider?: ProviderPreferences;
};

// Subtypes:

type TextContent = {
  type: 'text';
  text: string;
};

type ImageContentPart = {
  type: 'image_url';
  image_url: {
    url: string; // URL or base64 encoded image data
    detail?: string; // Optional, defaults to "auto"
  };
};

type ContentPart = TextContent | ImageContentPart;

type Message =
  | {
      role: 'user' | 'assistant' | 'system';
      // ContentParts are only for the "user" role:
      content: string | ContentPart[];
      // If "name" is included, it will be prepended like this
      // for non-OpenAI models: `{name}: {content}`
      name?: string;
    }
  | {
      role: 'tool';
      content: string;
      tool_call_id: string;
      name?: string;
    };

type FunctionDescription = {
  description?: string;
  name: string;
  parameters: object; // JSON Schema object
};

type Tool = {
  type: 'function';
  function: FunctionDescription;
};

type ToolChoice =
  | 'none'
  | 'auto'
  | {
      type: 'function';
      function: {
        name: string;
      };
    };

The response_format parameter ensures that you receive structured responses from the LLM. It is only supported by OpenAI models, Nitro models, and some other models.
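For example, a request body constraining the model to emit JSON might look like this (a sketch; the model name is illustrative and must be one that supports response_format):

```typescript
// Sketch: a request body asking for JSON-only output.
const body = {
  model: "openai/gpt-5.2", // illustrative; must be a model that supports response_format
  messages: [{ role: "user", content: "List three colors as a JSON object." }],
  response_format: { type: "json_object" as const },
};
```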

Non-standard parameters

If the chosen model doesn't support a request parameter (for example, logit_bias on non-OpenAI models, or top_k for OpenAI), the parameter is ignored. The remaining parameters are forwarded to the underlying model API.

Assistant prefill

Knox Chat supports asking models to continue a partial response. This can be useful for guiding a model to answer in a particular way.

To use this feature, simply include a message with role: "assistant" at the end of your messages array.

TypeScript
fetch('https://api.knox.chat/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer <KNOXCHAT_API_KEY>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/gpt-5.2',
    messages: [
      { role: 'user', content: 'What is the meaning of life?' },
      { role: 'assistant', content: "I'm not sure, but my best guess is" },
    ],
  }),
});

Images and multimodal

Multimodal requests are only available via the /v1/chat/completions API with a multi-part messages parameter. The image_url can be either a URL or base64-encoded image data.

"messages": [
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "What's in this image?"
      },
      {
        "type": "image_url",
        "image_url": {
          "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
        }
      }
    ]
  }
]

Sample LLM response:

{
  "choices": [
    {
      "role": "assistant",
      "content": "This image depicts a scenic natural landscape featuring a long wooden boardwalk that stretches out through an expansive field of green grass. The boardwalk provides a clear path and invites exploration through the lush environment. The scene is surrounded by a variety of shrubbery and trees in the background, indicating a diverse plant life in the area."
    }
  ]
}

Image generation

Some models support native image generation. To generate images, include modalities: ["image", "text"] in your request. The model will return images in the OpenAI ContentPartImage format, where image_url contains a base64 data URL.

{
  "model": "openai/dall-e-3",
  "messages": [
    {
      "role": "user",
      "content": "Create a beautiful sunset over mountains"
    }
  ],
  "modalities": ["image", "text"]
}

Sample image generation response:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": [
          {
            "type": "text",
            "text": "Here's your requested sunset over mountains."
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/png;base64,..."
            }
          }
        ]
      }
    }
  ]
}

Uploading base64-encoded images

For locally stored images, you can send them to the model via base64 encoding. Here's an example:

import { readFile } from "fs/promises";

const getFlowerImage = async (): Promise<string> => {
  const imagePath = new URL("flower.jpg", import.meta.url);
  const imageBuffer = await readFile(imagePath);
  const base64Image = imageBuffer.toString("base64");
  return `data:image/jpeg;base64,${base64Image}`;
};

...

"messages": [
  {
    role: "user",
    content: [
      {
        type: "text",
        text: "What's in this image?",
      },
      {
        type: "image_url",
        image_url: {
          url: `${await getFlowerImage()}`,
        },
      },
    ],
  },
];

When sending base64-encoded data strings, make sure to include the image's content type. Example:

data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII

Supported image types:

  • image/png
  • image/jpeg
  • image/webp
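A small sketch (the helper name is hypothetical) for building the correct data-URL prefix from a file extension, restricted to the supported types above:

```typescript
// Hypothetical helper: map a file extension to the data-URL prefix
// for one of the supported image types.
function dataUrlPrefix(ext: string): string {
  const mimeTypes: Record<string, string> = {
    png: "image/png",
    jpg: "image/jpeg",
    jpeg: "image/jpeg",
    webp: "image/webp",
  };
  const mime = mimeTypes[ext.toLowerCase()];
  if (!mime) throw new Error(`Unsupported image type: ${ext}`);
  return `data:${mime};base64,`;
}
```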

Responses

CompletionsResponse format

Knox Chat normalizes the schema across models and providers to comply with the OpenAI Chat API specification.

This means that choices is always an array, even if the model only returns one completion. Each choice will contain a delta property if a stream was requested, and a message property otherwise. This makes it easier to use the same code for all models.

Here's the TypeScript type definition for the response schema:

// Definitions of subtypes are below
type Response = {
  id: string;
  // Depending on whether you set "stream" to "true" and
  // whether you passed in "messages" or a "prompt", you
  // will get a different output shape
  choices: (NonStreamingChoice | StreamingChoice | NonChatChoice)[];
  created: number; // Unix timestamp
  model: string;
  object: 'chat.completion' | 'chat.completion.chunk';

  system_fingerprint?: string; // Only present if the provider supports it

  // Usage data is always returned for non-streaming.
  // When streaming, you will get one usage object at
  // the end accompanied by an empty choices array.
  usage?: ResponseUsage;
};

// If the provider returns usage, we pass it down
// as-is. Otherwise, we count using the GPT-4 tokenizer.
type ResponseUsage = {
  /** Including images and tools if any */
  prompt_tokens: number;
  /** The tokens generated */
  completion_tokens: number;
  /** Sum of the above two fields */
  total_tokens: number;
};
// Subtypes:
type NonChatChoice = {
  finish_reason: string | null;
  text: string;
  error?: ErrorResponse;
};

type NonStreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  message: {
    content: string | null;
    role: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type StreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  delta: {
    content: string | null;
    role?: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type ErrorResponse = {
  code: number; // See "Error Handling" section
  message: string;
  metadata?: Record<string, unknown>; // Contains additional error information such as provider details, the raw error message, etc.
};

type ToolCall = {
  id: string;
  type: 'function';
  function: FunctionCall;
};

Here's an example:

{
  "id": "gen-xxxxxxxxxxxxxx",
  "choices": [
    {
      "finish_reason": "stop", // Normalized finish_reason
      "native_finish_reason": "stop", // The raw finish_reason from the provider
      "message": {
        // will be "delta" if streaming
        "role": "assistant",
        "content": "Hello there!"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 4,
    "total_tokens": 4
  },
  "model": "anthropic/claude-sonnet-4.6" // Could also be "anthropic/claude-2.1", etc, depending on the "model" that ends up being used
}
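Because streaming chunks carry delta while full responses carry message, a small normalizing helper (hypothetical, modeled on the choice types above) lets the same code consume both:

```typescript
// Sketch: extract text from either a streaming or a non-streaming choice.
type AnyChoice = {
  delta?: { content: string | null };
  message?: { content: string | null };
};

function choiceText(choice: AnyChoice): string {
  // Streaming chunks populate "delta"; non-streaming responses populate "message".
  return choice.delta?.content ?? choice.message?.content ?? "";
}
```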

Finish reasons

Knox Chat normalizes each model's finish_reason to one of the following values: tool_calls, stop, length, content_filter, error.

Some models and providers may have additional finish reasons. The raw finish_reason string returned by the model is available via the native_finish_reason property.

Querying cost and stats

The token counts returned in the completions API response are not computed with the model's native tokenizer. Instead, they use a normalized, model-agnostic count (implemented via the GPT-5.2 tokenizer), because some providers can't reliably return native token counts. However, this is becoming less common, and we may add native token counts to the response object in the future.

Credit usage and model pricing are based on native token counts (not the "normalized" token counts returned in the API response).