Messages

POST https://api.knox.chat/v1/messages

Create a message using the Anthropic Messages API format. This endpoint provides full compatibility with Claude Code and Anthropic SDK clients.

Request

Requests to this endpoint use the following headers and JSON body properties:

Headers

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| Authorization | String | Yes | Bearer authentication in the form `Bearer <token>`, where `<token>` is your authorization token. Alternatively, use the `x-api-key` header. |
| x-api-key | String | No | Alternative authentication using an Anthropic-style API key header. |
| anthropic-version | String | No | Anthropic API version (e.g., `2023-06-01`). Optional but recommended. |

Request Body

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| model | String | Yes | The model to use. Examples: `anthropic/claude-sonnet-4.6`, `anthropic/claude-opus-4.6`, `anthropic/claude-haiku-4.5`, or aliases like `sonnet`, `opus`, `haiku`. |
| messages | Array | Yes | Array of message objects representing the conversation. |
| max_tokens | Integer | Yes | Maximum number of tokens to generate. |
| system | String or Array | No | System prompt. Can be a string or an array of content blocks with cache control. |
| metadata | Object | No | Request metadata containing an optional `user_id`. |
| stop_sequences | Array of Strings | No | Custom stop sequences that cause the model to stop generating. |
| stream | Boolean | No | Enable streaming responses using SSE. Defaults to `false`. |
| temperature | Double | No | Sampling temperature (range: [0.0, 1.0]). |
| top_k | Integer | No | Top-k sampling value. |
| top_p | Double | No | Top-p (nucleus) sampling value (range: (0, 1]). |
| tools | Array | No | Array of tool definitions for function calling. |
| tool_choice | Object | No | How the model should use tools: `{"type": "auto"}`, `{"type": "any"}`, or `{"type": "tool", "name": "..."}`. |
| thinking | Object | No | Extended thinking configuration for supported Claude models. Use `{"type": "enabled", "budget_tokens": 1024}`. |
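A request body containing only the three required fields can be assembled as follows. A minimal sketch in Python; the helper name is illustrative, and only `model`, `messages`, and `max_tokens` come from the table above:

```python
import json

def build_message_request(model: str, user_text: str, max_tokens: int = 1024) -> dict:
    """Assemble the minimal body for POST /v1/messages.
    model, messages, and max_tokens are the three required fields."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": user_text}],
    }

body = build_message_request("anthropic/claude-sonnet-4.6", "Hello, Claude!")
payload = json.dumps(body)  # serialized POST body
```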

Message Object

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| role | String | Yes | The role of the message author: `user` or `assistant`. |
| content | String or Array | Yes | The content of the message. Can be a string or an array of content blocks. |

Content Block Types

Text Block

{
  "type": "text",
  "text": "Your text content here",
  "cache_control": {"type": "ephemeral"}
}

Image Block

{
  "type": "image",
  "source": {
    "type": "base64",
    "media_type": "image/png",
    "data": "base64-encoded-image-data"
  }
}
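Producing the base64 payload for an image block is a one-liner. A sketch in Python; the helper name is illustrative, and only the block shape comes from the spec above:

```python
import base64

def image_block(image_bytes: bytes, media_type: str = "image/png") -> dict:
    """Wrap raw image bytes in the base64 image content block shown above."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(image_bytes).decode("ascii"),
        },
    }

# Stand-in bytes; in practice, read the file: open("chart.png", "rb").read()
block = image_block(b"\x89PNG\r\n\x1a\n")
```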

Tool Use Block (in assistant messages)

{
  "type": "tool_use",
  "id": "tool_call_id",
  "name": "function_name",
  "input": {"param": "value"}
}

Tool Result Block (in user messages)

{
  "type": "tool_result",
  "tool_use_id": "tool_call_id",
  "content": "Result of the tool call",
  "is_error": false
}
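These two blocks form a round trip: the assistant's tool_use block carries an `id`, and the next user message must echo it back as `tool_use_id`. A sketch of the pairing in Python (the helper name is illustrative):

```python
def tool_result_message(tool_use_block: dict, result: str, is_error: bool = False) -> dict:
    """Build the user message that answers an assistant tool_use block,
    echoing its id back as tool_use_id."""
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_block["id"],
            "content": result,
            "is_error": is_error,
        }],
    }

assistant_block = {"type": "tool_use", "id": "toolu_01A09q90qw90lq917835lhl",
                   "name": "get_weather", "input": {"location": "Tokyo, Japan"}}
reply = tool_result_message(assistant_block, "22°C, partly cloudy")
```

Append `reply` to the `messages` array, after the assistant message that contained the tool_use block, and send the conversation again.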

Tool Definition

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| name | String | Yes | The name of the tool/function. |
| description | String | No | Description of what the tool does. |
| input_schema | Object | Yes | JSON Schema object defining the tool's parameters. |

Cache Control (Prompt Caching)

Knox supports Anthropic's prompt caching feature to reduce costs on repeated prompts. Cache control can be applied to both text and image content blocks.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| type | String | Yes | Cache type. Use `"ephemeral"`. |
| ttl | String | No | Time-to-live. Defaults to 5 minutes. Use `"1h"` for a 1-hour TTL. |

Supported Cache Locations

Cache control can be added to:

  • System prompt (string or content blocks)
  • User message text blocks
  • User message image blocks

Cache Breakpoints

Add cache_control to mark cache breakpoints in your prompt. Content before the breakpoint will be cached and reused in subsequent requests.

{
  "system": [
    {
      "type": "text",
      "text": "Long reference documentation that should be cached...",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Large context to cache...",
          "cache_control": {"type": "ephemeral"}
        },
        {
          "type": "text",
          "text": "Your actual question (not cached)"
        }
      ]
    }
  ]
}

Image Caching

Images can also be cached, which is useful when asking multiple questions about the same image:

{
  "role": "user",
  "content": [
    {
      "type": "image",
      "source": {
        "type": "base64",
        "media_type": "image/png",
        "data": "base64-encoded-image-data"
      },
      "cache_control": {"type": "ephemeral"}
    },
    {
      "type": "text",
      "text": "What's in this image?"
    }
  ]
}

Cache Pricing

| Token Type | Cost Multiplier |
| --- | --- |
| cache_creation_input_tokens | 1.25x (25% more than regular input) |
| cache_read_input_tokens | 0.1x (90% discount from regular input) |
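The break-even arithmetic behind these multipliers is simple: a cache write costs 25% extra once, while every cache hit within the TTL pays only 10% of the normal input rate. A sketch of the effective input cost (the per-token price of 1.0 is a normalized placeholder, not a real rate):

```python
def input_cost(regular: int, cache_write: int, cache_read: int,
               price_per_token: float) -> float:
    """Effective input cost using the multipliers above:
    1.25x for cache_creation_input_tokens, 0.1x for cache_read_input_tokens."""
    return price_per_token * (regular + 1.25 * cache_write + 0.1 * cache_read)

# First request: 10,000 cacheable tokens are written to the cache.
first = input_cost(regular=100, cache_write=10_000, cache_read=0, price_per_token=1.0)
# Follow-up within the TTL: the same 10,000 tokens are read from cache.
later = input_cost(regular=100, cache_write=0, cache_read=10_000, price_per_token=1.0)
```

The one-time 25% write premium is recovered on the first cache hit, so caching pays off for any prompt that is reused at least once.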

Cache TTL Options

| TTL | Duration | Use Case |
| --- | --- | --- |
| `{"type": "ephemeral"}` | 5 minutes | Short conversations, quick follow-ups |
| `{"type": "ephemeral", "ttl": "1h"}` | 1 hour | Longer sessions, document analysis |

Best Practices

  1. Place cache breakpoints strategically: Cache large, static content like documentation, code files, or reference materials.
  2. Order content by stability: Put the most stable content first (system prompt), then cached user content, then dynamic queries.
  3. Minimum token threshold: Caching is most effective for prompts with at least 1,024 tokens of cacheable content.
  4. Reuse within TTL: Make follow-up requests within the TTL window to benefit from cache reads.

cURL Example

Basic Request

curl -X POST https://api.knox.chat/v1/messages \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "anthropic/claude-sonnet-4.6",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ]
  }'

With System Prompt

curl -X POST https://api.knox.chat/v1/messages \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.6",
    "max_tokens": 1024,
    "system": "You are a helpful coding assistant.",
    "messages": [
      {"role": "user", "content": "Write a Python hello world program"}
    ]
  }'

With Prompt Caching

curl -X POST https://api.knox.chat/v1/messages \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.6",
    "max_tokens": 1024,
    "system": [
      {
        "type": "text",
        "text": "Very long system prompt or documentation to cache...",
        "cache_control": {"type": "ephemeral"}
      }
    ],
    "messages": [
      {"role": "user", "content": "Question about the cached content"}
    ]
  }'

With Tool Use

curl -X POST https://api.knox.chat/v1/messages \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.6",
    "max_tokens": 1024,
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City and state, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        }
      }
    ],
    "messages": [
      {"role": "user", "content": "What is the weather in Tokyo?"}
    ]
  }'

Streaming Request

curl -X POST https://api.knox.chat/v1/messages \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.6",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      {"role": "user", "content": "Tell me a short story"}
    ]
  }'

Response

Success Response (200)

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello! I'm Claude, an AI assistant. How can I help you today?"
    }
  ],
  "model": "anthropic/claude-sonnet-4.6",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 25,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0
  }
}

Response with Tool Use

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "tool_use",
      "id": "toolu_01A09q90qw90lq917835lhl",
      "name": "get_weather",
      "input": {"location": "Tokyo, Japan"}
    }
  ],
  "model": "anthropic/claude-sonnet-4.6",
  "stop_reason": "tool_use",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 50,
    "output_tokens": 35
  }
}

Streaming Response

When stream: true, the response is sent as Server-Sent Events (SSE):

event: message_start
data: {"type":"message_start","message":{"id":"msg_01...","type":"message","role":"assistant","content":[],"model":"anthropic/claude-sonnet-4.6","stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":12,"output_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"!"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":25}}

event: message_stop
data: {"type":"message_stop"}
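The reply text is reassembled by concatenating the `text_delta` fragments from `content_block_delta` events. A sketch that parses `data:` lines from a stream like the one above (error handling and multi-block responses omitted):

```python
import json

def collect_text(sse_lines) -> str:
    """Concatenate text_delta fragments from an SSE event stream."""
    text = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip "event:" lines and blank separators
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                text.append(delta.get("text", ""))
    return "".join(text)

stream = [
    'data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"!"}}',
    'data: {"type":"message_stop"}',
]
```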

Response Schema

| Name | Type | Description |
| --- | --- | --- |
| id | String | Unique identifier for the message. |
| type | String | Always `"message"`. |
| role | String | Always `"assistant"`. |
| content | Array | Array of content blocks (text, tool_use, or thinking). |
| model | String | The model that generated the response. |
| stop_reason | String | Reason for stopping: `"end_turn"`, `"max_tokens"`, `"stop_sequence"`, or `"tool_use"`. |
| stop_sequence | String or null | The stop sequence that caused the model to stop, if applicable. |
| usage | Object | Token usage information. |

Usage Object

| Name | Type | Description |
| --- | --- | --- |
| input_tokens | Integer | Number of input tokens processed. |
| output_tokens | Integer | Number of output tokens generated. |
| cache_creation_input_tokens | Integer | Tokens written to cache (1.25x cost). Only present when using prompt caching. |
| cache_read_input_tokens | Integer | Tokens read from cache (0.1x cost). Only present when using prompt caching. |

Error Response

{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid request format: missing required field 'messages'"
  }
}

Error Types

| Type | Description |
| --- | --- |
| invalid_request_error | The request was malformed or missing required fields. |
| authentication_error | Invalid or missing API key. |
| permission_error | The API key doesn't have access to the requested model. |
| rate_limit_error | Too many requests. Please slow down. |
| api_error | An internal server error occurred. |
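A client can branch on `error.type` to decide whether a retry makes sense. A minimal sketch; the retry policy here is illustrative, not prescribed by the API:

```python
import json

# Assumption: rate limits and transient server errors are worth retrying,
# while malformed requests and auth failures are not.
RETRYABLE = {"rate_limit_error", "api_error"}

def classify_error(body: str) -> tuple:
    """Parse an error response body and report (error_type, should_retry)."""
    err = json.loads(body)["error"]
    return err["type"], err["type"] in RETRYABLE

sample = '{"type": "error", "error": {"type": "rate_limit_error", "message": "Too many requests."}}'
```

Pair the retryable cases with exponential backoff rather than immediate resubmission.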

Model Aliases

Knox automatically resolves model aliases for convenience:

| Alias | Resolved Model |
| --- | --- |
| haiku | anthropic/claude-haiku-4.5 |
| sonnet | anthropic/claude-sonnet-4.6 |
| opus | anthropic/claude-opus-4.6 |
| claude-3-5-sonnet-* | anthropic/claude-sonnet-4.6 |
| claude-3-5-haiku-* | anthropic/claude-haiku-4.5 |
| claude-3-5-opus-* | anthropic/claude-opus-4.6 |
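If you want to log which model an alias will resolve to, the table above can be mirrored client-side. Resolution actually happens server-side; this local copy is purely illustrative:

```python
ALIASES = {
    "haiku": "anthropic/claude-haiku-4.5",
    "sonnet": "anthropic/claude-sonnet-4.6",
    "opus": "anthropic/claude-opus-4.6",
}

PREFIXES = {
    "claude-3-5-sonnet": "anthropic/claude-sonnet-4.6",
    "claude-3-5-haiku": "anthropic/claude-haiku-4.5",
    "claude-3-5-opus": "anthropic/claude-opus-4.6",
}

def resolve(model: str) -> str:
    """Resolve a model name per the alias table: exact aliases first,
    then claude-3-5-* prefixes; anything else passes through unchanged."""
    if model in ALIASES:
        return ALIASES[model]
    for prefix, target in PREFIXES.items():
        if model.startswith(prefix):
            return target
    return model
```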

SDK Usage

Python (Anthropic SDK)

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.knox.chat",
    api_key="sk-your-knox-api-key",
)

message = client.messages.create(
    model="anthropic/claude-sonnet-4.6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude!"}
    ],
)

print(message.content[0].text)

JavaScript (Anthropic SDK)

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  baseURL: 'https://api.knox.chat',
  apiKey: 'sk-your-knox-api-key',
});

const message = await client.messages.create({
  model: 'anthropic/claude-sonnet-4.6',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude!' }
  ],
});

console.log(message.content[0].text);

Claude Code Configuration

export ANTHROPIC_BASE_URL="https://api.knox.chat"
export ANTHROPIC_AUTH_TOKEN="sk-your-knox-api-key"
export ANTHROPIC_API_KEY="" # Must be explicitly empty

Then run `claude` in your terminal to start Claude Code with Knox.