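Audio Inputs
Knox Chat supports sending audio files to compatible models through the API. This guide explains how to work with audio using our API.
Note: audio files must be base64-encoded. Direct audio URLs are not supported.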
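Because audio URLs are not accepted directly, one possible workaround is to download the file yourself and base64-encode the bytes before sending them. The following is a minimal sketch under that assumption; the helper names `encode_audio_bytes` and `fetch_and_encode` are illustrative and are not part of the Knox Chat API.

```python
import base64
import urllib.request


def encode_audio_bytes(raw: bytes) -> str:
    """Return the base64 text form of raw audio bytes."""
    return base64.b64encode(raw).decode("utf-8")


def fetch_and_encode(url: str, timeout: float = 30.0) -> str:
    """Download an audio file and return it base64-encoded.

    Hypothetical helper: the download happens on the client,
    since the API itself does not fetch audio URLs.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return encode_audio_bytes(resp.read())
```

The resulting string can then be used as the `data` value of an `input_audio` content part.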
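Audio Inputs
Audio files can be sent to compatible models through the /v1/chat/completions API using the input_audio content type. Each audio file must be base64-encoded and accompanied by its format. Note that only models with audio-processing capability can handle these requests.
Recommended models: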
- google/gemini-2.5-pro
- google/gemini-2.5-flash
- google/gemini-2.5-flash-lite
Sending Audio Files
Here is how to send an audio file for processing:
TypeScript:
import fs from "fs/promises";

// Read a file from disk and return its contents as a base64 string
async function encodeAudioToBase64(audioPath: string): Promise<string> {
  const audioBuffer = await fs.readFile(audioPath);
  return audioBuffer.toString("base64");
}

// Read and encode the audio file
const audioPath = "path/to/your/audio.wav";
const base64Audio = await encodeAudioToBase64(audioPath);

const response = await fetch("https://api.knox.chat/v1/chat/completions", {
  method: "POST",
  headers: {
    // Replace <KNOXCHAT_API_KEY> with your own API key
    Authorization: `Bearer <KNOXCHAT_API_KEY>`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "google/gemini-2.5-flash",
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "Please transcribe this audio file.",
          },
          {
            type: "input_audio",
            input_audio: {
              data: base64Audio,
              format: "wav",
            },
          },
        ],
      },
    ],
  }),
});

const data = await response.json();
console.log(data);
Python:

import base64
import os

import requests


def encode_audio_to_base64(audio_path):
    """Read a file from disk and return its contents as a base64 string."""
    with open(audio_path, "rb") as audio_file:
        return base64.b64encode(audio_file.read()).decode("utf-8")


# Assumption: the API key is stored in the KNOXCHAT_API_KEY environment variable
KNOXCHAT_API_KEY = os.environ["KNOXCHAT_API_KEY"]

url = "https://api.knox.chat/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {KNOXCHAT_API_KEY}",
    "Content-Type": "application/json",
}

# Read and encode the audio file
audio_path = "path/to/your/audio.wav"
base64_audio = encode_audio_to_base64(audio_path)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Please transcribe this audio file.",
            },
            {
                "type": "input_audio",
                "input_audio": {
                    "data": base64_audio,
                    "format": "wav",
                },
            },
        ],
    }
]

payload = {
    "model": "google/gemini-2.5-flash",
    "messages": messages,
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())
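The examples above print the whole response body. To pull out only the model's text, a small helper along the following lines may be useful. It is a sketch that assumes the endpoint returns the common chat-completions shape (`choices[0].message.content`), which this guide does not confirm; `extract_text` is a hypothetical name.

```python
from typing import Any


def extract_text(response_json: dict[str, Any]) -> str:
    """Pull the assistant's text out of a chat-completions style response.

    Assumes the shape {"choices": [{"message": {"content": "..."}}]};
    raises ValueError if that shape is not present.
    """
    try:
        content = response_json["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError(f"Unexpected response shape: {response_json!r}") from exc
    if not isinstance(content, str):
        raise ValueError("Response content is not plain text")
    return content
```

With the Python example above, this would be called as `extract_text(response.json())`. Raising on an unexpected shape keeps an error payload from being mistaken for a transcription.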
Supported audio formats:
- wav
- mp3
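Since each request must state the audio format, it can help to derive the `format` value from the file name and reject anything outside the supported list before uploading. This is a minimal sketch; `SUPPORTED_FORMATS` and `audio_format_from_path` are illustrative names, and it assumes the file extension matches the actual encoding.

```python
from pathlib import Path

# The formats listed in this guide
SUPPORTED_FORMATS = {"wav", "mp3"}


def audio_format_from_path(audio_path: str) -> str:
    """Return the `format` value for a file, based on its extension."""
    fmt = Path(audio_path).suffix.lower().lstrip(".")
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported audio format: {fmt or '(none)'}")
    return fmt
```

The returned value can be passed as the `format` field of the `input_audio` object instead of hard-coding "wav".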