Audio Inputs

Knox Chat supports sending audio files to compatible models through the API. This guide shows how to work with audio using our API.

Note: Audio files must be base64-encoded. Direct audio URLs are not supported.
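
Because URLs are not accepted, audio that lives on a remote server has to be downloaded and encoded on the client before the request is built. The following is a minimal sketch using only the Python standard library. The helper names `to_base64` and `fetch_audio_as_base64` are hypothetical and not part of the Knox Chat API.

```python
import base64
import urllib.request


def to_base64(raw: bytes) -> str:
    """Encode raw audio bytes as an ASCII base64 string."""
    return base64.b64encode(raw).decode("ascii")


def fetch_audio_as_base64(url: str, timeout: float = 30.0) -> str:
    """Download the audio at `url` and return its base64 encoding.

    Hypothetical helper: the API does not fetch URLs itself, so the
    download happens on the client side.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return to_base64(resp.read())
```

The string returned by either helper can be used as the `data` field of an `input_audio` content part.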

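Audio Inputs

Audio files can be sent to compatible models through the /v1/chat/completions API using the input_audio content type. Each file must be base64-encoded and accompanied by its format. Only models with audio processing capabilities can handle these requests.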

Recommended models:

google/gemini-2.5-pro
google/gemini-2.5-flash
google/gemini-2.5-flash-lite

Sending Audio Files

The following examples show how to send an audio file for processing:

TypeScript:

import fs from "fs/promises";

async function encodeAudioToBase64(audioPath: string): Promise<string> {
  const audioBuffer = await fs.readFile(audioPath);
  return audioBuffer.toString("base64");
}

// Read and encode the audio file
const audioPath = "path/to/your/audio.wav";
const base64Audio = await encodeAudioToBase64(audioPath);

const response = await fetch("https://api.knox.chat/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer <KNOXCHAT_API_KEY>`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "google/gemini-2.5-flash",
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "Please transcribe this audio file.",
          },
          {
            type: "input_audio",
            input_audio: {
              data: base64Audio,
              format: "wav",
            },
          },
        ],
      },
    ],
  }),
});

const data = await response.json();
console.log(data);

Python:

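import base64
import os

import requests

# Assumption: the API key is stored in the KNOXCHAT_API_KEY environment variable.
KNOXCHAT_API_KEY = os.environ["KNOXCHAT_API_KEY"]

def encode_audio_to_base64(audio_path):
    """Read the audio file and return its contents as a base64 string."""
    with open(audio_path, "rb") as audio_file:
        return base64.b64encode(audio_file.read()).decode("utf-8")

url = "https://api.knox.chat/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {KNOXCHAT_API_KEY}",
    "Content-Type": "application/json"
}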

# Read and encode the audio file
audio_path = "path/to/your/audio.wav"
base64_audio = encode_audio_to_base64(audio_path)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Please transcribe this audio file."
            },
            {
                "type": "input_audio",
                "input_audio": {
                    "data": base64_audio,
                    "format": "wav"
                }
            }
        ]
    }
]

payload = {
    "model": "google/gemini-2.5-flash",
    "messages": messages
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())
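
The examples above print the raw JSON. To pull out only the model's reply, something like the sketch below may help. It assumes the response follows the OpenAI-style chat completions shape (a `choices` list whose entries hold a `message` with `content`) and that failures are reported under an `error` key. Neither detail is confirmed by this guide, and `extract_text` is a hypothetical helper.

```python
def extract_text(response_json: dict) -> str:
    """Return the first choice's message content.

    Assumes an OpenAI-style response body; raises if the payload
    carries an error or has no choices.
    """
    if "error" in response_json:
        raise RuntimeError(f"API error: {response_json['error']}")
    choices = response_json.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Illustrative payload only, not real API output.
sample = {"choices": [{"message": {"role": "assistant", "content": "Hello world."}}]}
print(extract_text(sample))
```

In the Python example this would be called as `extract_text(response.json())`.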

Supported audio formats:

wav
mp3
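
Since only these two formats are listed, it can be useful to check a file before encoding it. The sketch below infers the format from the file extension and builds the `input_audio` content part used in the request examples. The helper `build_audio_part` and the constant `SUPPORTED_FORMATS` are hypothetical names, and the extension check is a heuristic that does not inspect the audio data itself.

```python
import base64
from pathlib import Path

# The formats listed in this guide.
SUPPORTED_FORMATS = {"wav", "mp3"}


def build_audio_part(audio_path: str) -> dict:
    """Build an `input_audio` content part, inferring format from the extension."""
    path = Path(audio_path)
    fmt = path.suffix.lower().lstrip(".")
    # Reject unsupported formats before reading the file.
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {fmt!r}")
    data = base64.b64encode(path.read_bytes()).decode("ascii")
    return {"type": "input_audio", "input_audio": {"data": data, "format": fmt}}
```

The returned dictionary can be placed directly in the `content` list of a user message, next to the text part.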
