Messages

POST https://api.knox.chat/v1/messages

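Creates a message using the Anthropic Messages API format. This endpoint is fully compatible with Claude Code and Anthropic SDK clients.

Request

This endpoint expects a request with the following headers and body properties:

Request Headers

Name	Type	Required	Description
Authorization	String	Yes	Bearer authentication in the form `Bearer token`, where `token` is your authorization token. The `x-api-key` header can be used instead.
x-api-key	String	No	Alternative authentication using the Anthropic-style API key header.
anthropic-version	String	No	Anthropic API version (for example, `2023-06-01`). Optional but recommended.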
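
The two authentication styles in the table can be sketched as a small helper. This is only an illustration: `build_headers` and its parameters are hypothetical names, not part of the Knox API, and the `Content-Type: application/json` header is taken from the cURL examples on this page.

```python
from typing import Dict, Optional


def build_headers(
    token: str,
    use_x_api_key: bool = False,
    anthropic_version: Optional[str] = "2023-06-01",
) -> Dict[str, str]:
    """Build request headers for POST /v1/messages (hypothetical helper)."""
    headers = {"Content-Type": "application/json"}
    if use_x_api_key:
        # Anthropic-style alternative to the Authorization header.
        headers["x-api-key"] = token
    else:
        headers["Authorization"] = f"Bearer {token}"
    if anthropic_version is not None:
        # Optional, but the header table recommends sending it.
        headers["anthropic-version"] = anthropic_version
    return headers


bearer_headers = build_headers("sk-your-knox-api-key")
api_key_headers = build_headers("sk-your-knox-api-key", use_x_api_key=True)
```

Either dictionary can be passed to any HTTP client; only one of the two authentication headers should be needed per request.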

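Request Body

Name	Type	Required	Description
model	String	Yes	The model to use. Examples: `anthropic/claude-sonnet-4.6`, `anthropic/claude-opus-4.6`, `anthropic/claude-haiku-4.5`, or an alias such as `sonnet`, `opus`, or `haiku`.
messages	Array	Yes	Array of message objects representing the conversation.
max_tokens	Integer	Yes	Maximum number of tokens to generate.
system	String or Array	No	System prompt. Either a string or an array of content blocks with cache control.
metadata	Object	No	Request metadata, with an optional `user_id`.
stop_sequences	Array of Strings	No	Custom stop sequences. Generation stops when the model produces one of them.
stream	Boolean	No	Enables streaming responses over SSE. Defaults to `false`.
temperature	Double	No	Sampling temperature (range: [0.0, 1.0]).
top_k	Integer	No	Top-k sampling value.
top_p	Double	No	Top-p (nucleus sampling) value (range: (0, 1]).
web_search	Boolean	No	Enables or disables automatic web search. Defaults to `true`; set to `false` to disable it. When enabled, the model can search the web for up-to-date information. Billed at $10 per 1,000 search calls.
tools	Array	No	Array of tool definitions for function calling.
tool_choice	Object	No	How the model should use tools: `{"type": "auto"}`, `{"type": "any"}`, or `{"type": "tool", "name": "..."}`.
thinking	Object	No	Extended thinking configuration for Claude 3.5+. Use `{"type": "enabled", "budget_tokens": 1024}`.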
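
As a rough illustration of the table above, the sketch below checks a request body on the client before sending it. `validate_request_body` is a hypothetical helper, not something Knox provides; it covers only the three required fields and the two documented numeric ranges, and the server remains the authority on what is valid.

```python
from typing import Any, Dict, List


def validate_request_body(body: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    # model, messages and max_tokens are the fields marked Required above.
    for field in ("model", "messages", "max_tokens"):
        if field not in body:
            problems.append(f"missing required field '{field}'")
    temperature = body.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        problems.append("temperature must be in [0.0, 1.0]")
    top_p = body.get("top_p")
    if top_p is not None and not 0.0 < top_p <= 1.0:
        problems.append("top_p must be in (0, 1]")
    return problems


ok_body = {
    "model": "anthropic/claude-sonnet-4.6",
    "max_tokens": 1024,
    "temperature": 0.2,
    "messages": [{"role": "user", "content": "Hello, Claude!"}],
}
bad_body = {"model": "sonnet", "temperature": 1.5, "top_p": 0}

ok_problems = validate_request_body(ok_body)
bad_problems = validate_request_body(bad_body)
```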

Message Object

Name	Type	Required	Description
role	String	Yes	Role of the message author: `user` or `assistant`.
content	String or Array	Yes	Content of the message. Either a string or an array of content blocks.

Content Block Types

Text Block

{
  "type": "text",
  "text": "Your text content here",
  "cache_control": {"type": "ephemeral"}
}

Image Block

{
  "type": "image",
  "source": {
    "type": "base64",
    "media_type": "image/png",
    "data": "base64-encoded-image-data"
  }
}

Tool Use Block (in assistant messages)

{
  "type": "tool_use",
  "id": "tool_call_id",
  "name": "function_name",
  "input": {"param": "value"}
}

Tool Result Block (in user messages)

{
  "type": "tool_result",
  "tool_use_id": "tool_call_id",
  "content": "Result of the tool call",
  "is_error": false
}
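
The tool use and tool result blocks pair up through `id` and `tool_use_id`. A minimal sketch of that round trip, assuming tools are plain Python callables held in a dictionary, might look like the following. `build_tool_results` and the stand-in `get_weather` handler are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Dict, List


def build_tool_results(
    assistant_content: List[Dict[str, Any]],
    handlers: Dict[str, Callable[..., str]],
) -> Dict[str, Any]:
    """Turn the tool_use blocks of an assistant message into one user message."""
    results = []
    for block in assistant_content:
        if block.get("type") != "tool_use":
            continue  # text and other blocks need no reply
        try:
            output = handlers[block["name"]](**block["input"])
            is_error = False
        except Exception as exc:
            # Report the failure back to the model instead of raising.
            output = f"{type(exc).__name__}: {exc}"
            is_error = True
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # must match the tool_use block's id
            "content": output,
            "is_error": is_error,
        })
    return {"role": "user", "content": results}


def get_weather(location: str) -> str:
    # Stand-in implementation; a real handler would call a weather service.
    return f"Sunny in {location}"


assistant_content = [
    {"type": "tool_use", "id": "tool_call_id", "name": "get_weather",
     "input": {"location": "Tokyo, Japan"}},
]
follow_up = build_tool_results(assistant_content, {"get_weather": get_weather})
```

The returned message would be appended to `messages` after the assistant message that requested the tool, and the conversation sent again.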

Tool Definition

Name	Type	Required	Description
name	String	Yes	Name of the tool or function.
description	String	No	Describes what the tool does.
input_schema	Object	Yes	JSON Schema object that defines the tool's parameters.

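Cache Control (Prompt Caching)

Knox supports Anthropic's prompt caching, which reduces the cost of repeated prompts. Cache control can be applied to text and image content blocks.

Name	Type	Required	Description
type	String	Yes	Cache type. Use `"ephemeral"`.
ttl	String	No	Time to live. Defaults to 5 minutes. Use `"1h"` for a 1-hour TTL.

Supported Cache Locations

Cache control can be added to:

System prompts (string or content blocks)
Text blocks in user messages
Image blocks in user messages

Cache Breakpoints

Add cache_control to the prompt to mark a cache breakpoint. Content up to the breakpoint is cached and reused in subsequent requests.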

{
  "system": [
    {
      "type": "text",
      "text": "Long reference documentation that should be cached...",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Large context to cache...",
          "cache_control": {"type": "ephemeral"}
        },
        {
          "type": "text",
          "text": "Your actual question (not cached)"
        }
      ]
    }
  ]
}

Image Caching

Images can also be cached, which is useful when asking several questions about the same image:

{
  "role": "user",
  "content": [
    {
      "type": "image",
      "source": {
        "type": "base64",
        "media_type": "image/png",
        "data": "base64-encoded-image-data"
      },
      "cache_control": {"type": "ephemeral"}
    },
    {
      "type": "text",
      "text": "What's in this image?"
    }
  ]
}
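
The image block above expects base64 text in `data`. One way to produce such a block from raw bytes is sketched below; `image_block` is a hypothetical helper, and the four bytes used in the example are just the start of a PNG signature rather than a complete image.

```python
import base64
from typing import Any, Dict


def image_block(
    data: bytes, media_type: str = "image/png", cache: bool = False
) -> Dict[str, Any]:
    """Wrap raw image bytes in an image content block (hypothetical helper)."""
    block: Dict[str, Any] = {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            # The API takes base64 text, so decode the encoded bytes to str.
            "data": base64.b64encode(data).decode("ascii"),
        },
    }
    if cache:
        # Mark the image as a cache breakpoint with the default TTL.
        block["cache_control"] = {"type": "ephemeral"}
    return block


cached_image = image_block(b"\x89PNG", cache=True)
plain_image = image_block(b"\x89PNG")
```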

Cache Pricing

Token Type	Cost Multiplier
cache_creation_input_tokens	1.25x (25% more than regular input)
cache_read_input_tokens	0.1x (90% less than regular input)
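
To make the multipliers concrete, the sketch below converts a usage object into "regular input token equivalents". It assumes that `input_tokens` counts only tokens that were neither written to nor read from the cache, which is how Anthropic reports usage; whether Knox bills exactly this way is not stated here, so treat the function as an estimate. `input_cost_in_base_tokens` is a hypothetical name.

```python
from typing import Dict

CACHE_WRITE_MULTIPLIER = 1.25  # cache_creation_input_tokens
CACHE_READ_MULTIPLIER = 0.1    # cache_read_input_tokens


def input_cost_in_base_tokens(usage: Dict[str, int]) -> float:
    """Weight each input token bucket by its multiplier from the pricing table."""
    regular = usage.get("input_tokens", 0)
    written = usage.get("cache_creation_input_tokens", 0)
    read = usage.get("cache_read_input_tokens", 0)
    return (
        regular
        + written * CACHE_WRITE_MULTIPLIER
        + read * CACHE_READ_MULTIPLIER
    )


# First request writes a 2,000-token prefix to the cache; the second reads it.
first_call = {"input_tokens": 20, "cache_creation_input_tokens": 2000,
              "cache_read_input_tokens": 0}
second_call = {"input_tokens": 20, "cache_creation_input_tokens": 0,
               "cache_read_input_tokens": 2000}

first_cost = input_cost_in_base_tokens(first_call)    # about 2520
second_cost = input_cost_in_base_tokens(second_call)  # about 220
```

Under these assumptions the two calls together cost about 2,740 token equivalents, compared with 4,040 for sending the same 2,020 input tokens twice without caching.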

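Cache TTL Options

cache_control	Duration	Use Case
`{"type": "ephemeral"}`	5 minutes	Short conversations, quick follow-ups
`{"type": "ephemeral", "ttl": "1h"}`	1 hour	Longer sessions, document analysis

Best Practices

Place cache breakpoints strategically: cache large static content such as documents, code files, or reference material.
Order content by stability: put the most stable content first (the system prompt), then cached user content, and the dynamic query last.
Minimum token threshold: caching works best for prompts with at least 1,024 cacheable tokens.
Reuse within the TTL window: send follow-up requests within the TTL window to benefit from cache reads.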
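
These practices can be combined in a small request builder. The following is a sketch under stated assumptions: `estimate_tokens` uses a crude four-characters-per-token guess rather than a real tokenizer, and `cached_text_block` and `build_request` are hypothetical helpers, not Knox functions.

```python
from typing import Any, Dict, Optional

MIN_CACHEABLE_TOKENS = 1024  # threshold taken from the best practices


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token. Not a real tokenizer.
    return len(text) // 4


def cached_text_block(text: str, ttl: Optional[str] = None) -> Dict[str, Any]:
    """Return a text block, adding cache_control only when it looks worthwhile."""
    block: Dict[str, Any] = {"type": "text", "text": text}
    if estimate_tokens(text) >= MIN_CACHEABLE_TOKENS:
        cache_control = {"type": "ephemeral"}
        if ttl is not None:
            cache_control["ttl"] = ttl
        block["cache_control"] = cache_control
    return block


def build_request(reference_docs: str, context: str, question: str) -> Dict[str, Any]:
    """Order content from most stable (system) to most dynamic (question)."""
    return {
        "model": "anthropic/claude-sonnet-4.6",
        "max_tokens": 1024,
        "system": [cached_text_block(reference_docs, ttl="1h")],
        "messages": [{
            "role": "user",
            "content": [
                cached_text_block(context),           # cached only if large
                {"type": "text", "text": question},   # dynamic, never cached
            ],
        }],
    }


request = build_request("docs " * 2000, "short context", "What changed?")
```

Here the long system prompt receives a 1-hour breakpoint, while the short context falls below the threshold and is sent without `cache_control`.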

cURL Examples

Basic Request

curl -X POST https://api.knox.chat/v1/messages \
     -H "Authorization: Bearer <token>" \
     -H "Content-Type: application/json" \
     -H "anthropic-version: 2023-06-01" \
     -d '{
  "model": "anthropic/claude-sonnet-4.6",
  "max_tokens": 1024,
  "messages": [
    {"role": "user", "content": "Hello, Claude!"}
  ]
}'

With a System Prompt

curl -X POST https://api.knox.chat/v1/messages \
     -H "Authorization: Bearer <token>" \
     -H "Content-Type: application/json" \
     -d '{
  "model": "anthropic/claude-sonnet-4.6",
  "max_tokens": 1024,
  "system": "You are a helpful coding assistant.",
  "messages": [
    {"role": "user", "content": "Write a Python hello world program"}
  ]
}'

With Prompt Caching

curl -X POST https://api.knox.chat/v1/messages \
     -H "Authorization: Bearer <token>" \
     -H "Content-Type: application/json" \
     -d '{
  "model": "anthropic/claude-sonnet-4.6",
  "max_tokens": 1024,
  "system": [
    {
      "type": "text",
      "text": "Very long system prompt or documentation to cache...",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [
    {"role": "user", "content": "Question about the cached content"}
  ]
}'

With Tool Use

curl -X POST https://api.knox.chat/v1/messages \
     -H "Authorization: Bearer <token>" \
     -H "Content-Type: application/json" \
     -d '{
  "model": "anthropic/claude-sonnet-4.6",
  "max_tokens": 1024,
  "tools": [
    {
      "name": "get_weather",
      "description": "Get the current weather for a location",
      "input_schema": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA"
          }
        },
        "required": ["location"]
      }
    }
  ],
  "messages": [
    {"role": "user", "content": "What is the weather in Tokyo?"}
  ]
}'

Disabling Web Search

curl -X POST https://api.knox.chat/v1/messages \
     -H "Authorization: Bearer <token>" \
     -H "Content-Type: application/json" \
     -d '{
  "model": "anthropic/claude-sonnet-4.6",
  "max_tokens": 1024,
  "web_search": false,
  "messages": [
    {"role": "user", "content": "What is the latest news?"}
  ]
}'

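Note: Web search is enabled by default for all Anthropic models. When a query needs real-time information, the model searches the web automatically. Web search is billed at $10 per 1,000 search calls, in addition to standard token charges.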
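
The search pricing in the note works out to $0.01 per search call. A minimal sketch of that arithmetic, with the hypothetical helper `web_search_cost_usd`, is shown below; how many searches a given request triggers is decided by the model and is not predicted here.

```python
SEARCH_PRICE_PER_1000_USD = 10.0  # from the note: $10 per 1,000 search calls


def web_search_cost_usd(search_calls: int) -> float:
    """Cost of the given number of search calls, excluding token charges."""
    return search_calls * SEARCH_PRICE_PER_1000_USD / 1000


single_search = web_search_cost_usd(1)    # one cent
monthly_total = web_search_cost_usd(250)  # two and a half dollars
```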

Streaming Request

curl -X POST https://api.knox.chat/v1/messages \
     -H "Authorization: Bearer <token>" \
     -H "Content-Type: application/json" \
     -d '{
  "model": "anthropic/claude-sonnet-4.6",
  "max_tokens": 1024,
  "stream": true,
  "messages": [
    {"role": "user", "content": "Tell me a short story"}
  ]
}'

Response

Successful Response (200)

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello! I'm Claude, an AI assistant. How can I help you today?"
    }
  ],
  "model": "anthropic/claude-sonnet-4.6",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 25,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0
  }
}

Response with Tool Use

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "tool_use",
      "id": "toolu_01A09q90qw90lq917835lhl",
      "name": "get_weather",
      "input": {"location": "Tokyo, Japan"}
    }
  ],
  "model": "anthropic/claude-sonnet-4.6",
  "stop_reason": "tool_use",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 50,
    "output_tokens": 35
  }
}

Streaming Response

When stream: true is set, the response is sent as Server-Sent Events (SSE):

event: message_start
data: {"type":"message_start","message":{"id":"msg_01...","type":"message","role":"assistant","content":[],"model":"anthropic/claude-sonnet-4.6","stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":12,"output_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"!"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":25}}

event: message_stop
data: {"type":"message_stop"}
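
A client has to reassemble the text from the `content_block_delta` events. The sketch below parses already-decoded SSE lines and concatenates every `text_delta`; it is a simplified reader that assumes single-line `data:` payloads like those shown above and ignores other SSE fields. `iter_sse_events` and `collect_text` are hypothetical names.

```python
import json
from typing import Iterable, Iterator, List, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], dict]]:
    """Yield (event_name, data) pairs; a blank line terminates each event."""
    event_name: Optional[str] = None
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "" and data_parts:
            yield event_name, json.loads("\n".join(data_parts))
            event_name, data_parts = None, []
    if data_parts:  # stream ended without a trailing blank line
        yield event_name, json.loads("\n".join(data_parts))


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate every text_delta found in the stream."""
    chunks = []
    for _, data in iter_sse_events(lines):
        if data.get("type") == "content_block_delta":
            delta = data["delta"]
            if delta.get("type") == "text_delta":
                chunks.append(delta["text"])
    return "".join(chunks)


sample_stream = [
    'event: content_block_start',
    'data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}',
    '',
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}',
    '',
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"!"}}',
    '',
    'event: message_stop',
    'data: {"type":"message_stop"}',
]
streamed_text = collect_text(sample_stream)
```

In practice the Anthropic SDKs handle this parsing, so a hand-written reader like this is mainly useful for debugging raw responses.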

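Response Structure

Name	Type	Description
id	String	Unique identifier for the message.
type	String	Always `"message"`.
role	String	Always `"assistant"`.
content	Array	Array of content blocks (text, tool_use, or thinking).
model	String	The model that generated the response.
stop_reason	String	Why generation stopped: `"end_turn"`, `"max_tokens"`, `"stop_sequence"`, or `"tool_use"`.
stop_sequence	String or null	The stop sequence that caused the model to stop, if applicable.
usage	Object	Token usage information.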
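
Client code usually branches on `stop_reason` and reads text only from text blocks. The following sketch shows one way to do that on a parsed response dictionary; `extract_text`, `next_action`, and the action labels it returns are hypothetical and chosen only for this illustration.

```python
from typing import Any, Dict


def extract_text(response: Dict[str, Any]) -> str:
    """Join the text of every text block, skipping tool_use and thinking blocks."""
    return "".join(
        block["text"]
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


def next_action(response: Dict[str, Any]) -> str:
    """Map stop_reason to what the caller might do next."""
    reason = response.get("stop_reason")
    if reason == "tool_use":
        return "run_tools"   # execute the requested tools, then send results
    if reason == "max_tokens":
        return "truncated"   # output was cut off at max_tokens
    return "done"            # end_turn or stop_sequence


text_response = {
    "content": [{"type": "text", "text": "Hello! How can I help you today?"}],
    "stop_reason": "end_turn",
}
tool_response = {
    "content": [{"type": "tool_use", "id": "toolu_01", "name": "get_weather",
                 "input": {"location": "Tokyo, Japan"}}],
    "stop_reason": "tool_use",
}
```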

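Usage Object

Name	Type	Description
input_tokens	Integer	Number of input tokens processed.
output_tokens	Integer	Number of output tokens generated.
cache_creation_input_tokens	Integer	Number of tokens written to the cache (billed at 1.25x). Present only when prompt caching is used.
cache_read_input_tokens	Integer	Number of tokens read from the cache (billed at 0.1x). Present only when prompt caching is used.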

Error Response

{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid request format: missing required field 'messages'"
  }
}

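Error Types

Type	Description
invalid_request_error	The request is malformed or missing a required field.
authentication_error	The API key is invalid or missing.
permission_error	The API key is not allowed to access the requested model.
rate_limit_error	Too many requests. Reduce your request rate.
api_error	An internal server error occurred.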
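
One common way to act on these error types is to retry only the transient ones. The sketch below treats `rate_limit_error` and `api_error` as retryable; that split is an assumption made for illustration, not documented Knox behavior, and `parse_error`, `should_retry`, and `backoff_delays` are hypothetical helpers.

```python
from typing import Any, Dict, List, Tuple

# Assumption: only these two error types are worth retrying.
RETRYABLE_ERROR_TYPES = {"rate_limit_error", "api_error"}


def parse_error(payload: Dict[str, Any]) -> Tuple[str, str]:
    """Return (error_type, message) from an error response body."""
    if payload.get("type") != "error":
        raise ValueError("payload is not an error response")
    error = payload["error"]
    return error["type"], error["message"]


def should_retry(payload: Dict[str, Any]) -> bool:
    error_type, _ = parse_error(payload)
    return error_type in RETRYABLE_ERROR_TYPES


def backoff_delays(attempts: int, base_seconds: float = 1.0) -> List[float]:
    """Exponential backoff schedule: base, 2*base, 4*base, ..."""
    return [base_seconds * (2 ** i) for i in range(attempts)]


invalid = {"type": "error", "error": {
    "type": "invalid_request_error",
    "message": "Invalid request format: missing required field 'messages'"}}
limited = {"type": "error", "error": {
    "type": "rate_limit_error", "message": "Too many requests"}}
```

A caller might sleep for each value in `backoff_delays(3)` between attempts while `should_retry` returns true, and surface the message immediately otherwise.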

Model Aliases

Knox resolves model aliases automatically for convenience:

Alias	Resolved Model
`haiku`	`anthropic/claude-haiku-4.5`
`sonnet`	`anthropic/claude-sonnet-4.6`
`opus`	`anthropic/claude-opus-4.6`
`claude-3-5-sonnet-*`	`anthropic/claude-sonnet-4.6`
`claude-3-5-haiku-*`	`anthropic/claude-haiku-4.5`
`claude-3-5-opus-*`	`anthropic/claude-opus-4.6`
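
Alias resolution happens on the Knox side, so clients do not need to implement it. For logging or tests, though, the table can be mirrored locally, as in this sketch. It assumes the `*` in the table behaves like a shell-style wildcard, and `resolve_model` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Client-side copy of the alias table; the server remains the source of truth.
ALIASES = [
    ("haiku", "anthropic/claude-haiku-4.5"),
    ("sonnet", "anthropic/claude-sonnet-4.6"),
    ("opus", "anthropic/claude-opus-4.6"),
    ("claude-3-5-sonnet-*", "anthropic/claude-sonnet-4.6"),
    ("claude-3-5-haiku-*", "anthropic/claude-haiku-4.5"),
    ("claude-3-5-opus-*", "anthropic/claude-opus-4.6"),
]


def resolve_model(name: str) -> str:
    """Return the resolved model for an alias, or the name unchanged."""
    for pattern, target in ALIASES:
        # fnmatchcase does case-sensitive matching, with * as a wildcard.
        if fnmatchcase(name, pattern):
            return target
    return name


resolved_short = resolve_model("sonnet")
resolved_dated = resolve_model("claude-3-5-haiku-20241022")
resolved_full = resolve_model("anthropic/claude-opus-4.6")
```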

SDK Usage

Python (Anthropic SDK)

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.knox.chat",
    api_key="sk-your-knox-api-key",
)

message = client.messages.create(
    model="anthropic/claude-sonnet-4.6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude!"}
    ]
)

print(message.content[0].text)

JavaScript (Anthropic SDK)

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  baseURL: 'https://api.knox.chat',
  apiKey: 'sk-your-knox-api-key',
});

const message = await client.messages.create({
  model: 'anthropic/claude-sonnet-4.6',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude!' }
  ],
});

console.log(message.content[0].text);

Claude Code Configuration

export ANTHROPIC_BASE_URL="https://api.knox.chat"
export ANTHROPIC_AUTH_TOKEN="sk-your-knox-api-key"
export ANTHROPIC_API_KEY=""  # Must be explicitly empty

Then run claude in your terminal to start Claude Code through Knox.
