Skip to main content

Claude Models Reasoning and Web Search

Knox has backend adapters for the Claude model IDs anthropic/claude-sonnet-4.6, anthropic/claude-opus-4.6, and anthropic/claude-opus-4.7. These adapters can translate Knox request fields into Anthropic adaptive thinking settings and can also inject the Anthropic web-search tool when the request is routed through an AI gateway-backed channel.

Use /v1/chat/completions if you want reasoning content and web citations in the response. /v1/completions is available for compatibility, but it returns plain text only.

Supported adaptive reasoning levels

ModelAllowed reasoning_effort valuesBackend default
anthropic/claude-sonnet-4.6low, medium, highhigh
anthropic/claude-opus-4.6medium, high, maxhigh
anthropic/claude-opus-4.7high, xhigh, maxxhigh

When you send reasoning_effort, Knox translates it into an Anthropic thinking block like this:

{
"thinking": {
"type": "adaptive",
"effort": "high"
}
}

Web search behavior

For Anthropic models routed through Knox channels:

  • web_search defaults to true on both /v1/chat/completions and /v1/completions.
  • Set "web_search": false to disable live search for a request.
  • In chat completions responses, search citations are normalized into message.annotations using OpenAI-style url_citation entries.

Chat Completions Example

This is the best choice when you want both reasoning output and search citations back from the API.

curl https://api.knox.chat/v1/chat/completions \
-H "Authorization: Bearer $KNOXCHAT_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-sonnet-4.6",
"messages": [
{
"role": "user",
"content": "Summarize the latest changes in the EU AI Act and cite sources."
}
],
"reasoning_effort": "high",
"web_search": true,
"max_tokens": 1200
}'

Response shape with citations

{
"choices": [
{
"message": {
"role": "assistant",
"reasoning": "Search current legislative sources, reconcile drafts versus published text, then summarize by compliance impact.",
"content": "The latest updates focus on implementation timing, GPAI obligations, and enforcement milestones.",
"annotations": [
{
"type": "url_citation",
"url_citation": {
"url": "https://example.com/source",
"title": "Source title"
}
}
]
}
}
]
}

Opus example

Use a higher reasoning level for the larger Opus models:

{
"model": "anthropic/claude-opus-4.7",
"messages": [
{
"role": "user",
"content": "Research the current state of battery supply chains and produce a risk briefing with sources."
}
],
"reasoning_effort": "xhigh",
"web_search": true,
"max_tokens": 2000
}

Advanced passthrough: explicit thinking

If you want to control the Anthropic-native payload directly, Knox also accepts a thinking object and forwards it upstream:

{
"model": "anthropic/claude-opus-4.6",
"messages": [
{
"role": "user",
"content": "Evaluate the tradeoffs of building an internal search index versus using managed search."
}
],
"thinking": {
"type": "adaptive",
"effort": "max"
},
"web_search": false
}

Use this only when you want raw Anthropic-style control. For most Knox clients, reasoning_effort is the simpler interface.

Using /v1/completions

Knox also supports legacy prompt-based access for these Claude models:

curl https://api.knox.chat/v1/completions \
-H "Authorization: Bearer $KNOXCHAT_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-opus-4.6",
"prompt": "Find the latest SOC 2 guidance updates and summarize the implementation impact.",
"reasoning_effort": "max",
"web_search": true,
"max_tokens": 900
}'

On this route, Knox converts the prompt into chat messages internally and may still perform search and adaptive thinking upstream. The returned payload is converted into text completion format, so choices[].text is preserved, but message.reasoning and message.annotations are not.

Practical guidance

  • Use /v1/chat/completions when you need reasoning output or citations.
  • Set web_search: false if you want Claude reasoning without live search.
  • Prefer reasoning_effort for normal Knox usage, and use thinking only for direct Anthropic-style control.
  • anthropic/claude-opus-4.7 supports the highest default reasoning level in the current backend mapping.