Audio Inputs

Knox Chat supports sending audio files to compatible models via the API. This guide shows how to encode an audio file and include it in a chat completion request.

Note: Audio files must be base64-encoded; direct URLs are not supported for audio content.

Send audio to compatible models through the /v1/chat/completions endpoint using the input_audio content type. Each input_audio part carries the base64-encoded audio in its data field and the file format in its format field. Only models with audio processing capabilities will handle these requests.
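
As a minimal sketch of the content part described above, the following Python helper wraps raw audio bytes in an input_audio part. The helper name build_audio_part is hypothetical and not part of any Knox Chat SDK; the field names mirror the request examples in this guide.

```python
import base64


def build_audio_part(audio_bytes: bytes, audio_format: str) -> dict:
    """Wrap raw audio bytes in an input_audio content part.

    build_audio_part is a hypothetical helper for illustration only.
    """
    # Base64 output is plain ASCII, so it can be embedded in a JSON string.
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "type": "input_audio",
        "input_audio": {"data": encoded, "format": audio_format},
    }


# Example with a few placeholder bytes instead of a real recording.
part = build_audio_part(b"RIFF", "wav")
print(part["type"], part["input_audio"]["format"])
```

The resulting dictionary can be placed in the content list of a user message next to a text part.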

Recommended Models:

google/gemini-2.5-pro
google/gemini-2.5-flash
google/gemini-2.5-flash-lite
openai/gpt-4o-audio-preview

Sending Audio Files

The following examples read a local audio file, base64-encode it, and send it together with a text prompt:

TypeScript

import fs from "fs/promises";

async function encodeAudioToBase64(audioPath: string): Promise<string> {
  const audioBuffer = await fs.readFile(audioPath);
  return audioBuffer.toString("base64");
}

// Read and encode the audio file
const audioPath = "path/to/your/audio.wav";
const base64Audio = await encodeAudioToBase64(audioPath);

const response = await fetch("https://api.knox.chat/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer <KNOXCHAT_API_KEY>`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "google/gemini-2.5-flash",
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "Please transcribe this audio file.",
          },
          {
            type: "input_audio",
            input_audio: {
              data: base64Audio,
              format: "wav",
            },
          },
        ],
      },
    ],
  }),
});

const data = await response.json();
console.log(data);

Python

import base64

import requests

# Replace the placeholder with your Knox Chat API key
KNOXCHAT_API_KEY = "<KNOXCHAT_API_KEY>"

def encode_audio_to_base64(audio_path):
    with open(audio_path, "rb") as audio_file:
        return base64.b64encode(audio_file.read()).decode('utf-8')

url = "https://api.knox.chat/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {KNOXCHAT_API_KEY}",
    "Content-Type": "application/json"
}

# Read and encode the audio file
audio_path = "path/to/your/audio.wav"
base64_audio = encode_audio_to_base64(audio_path)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Please transcribe this audio file."
            },
            {
                "type": "input_audio",
                "input_audio": {
                    "data": base64_audio,
                    "format": "wav"
                }
            }
        ]
    }
]

payload = {
    "model": "google/gemini-2.5-flash",
    "messages": messages
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())
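
The examples above print the raw JSON response. A rough sketch of pulling out just the model's reply might look like the following. It assumes the response follows the common chat completions shape (choices[0].message.content), which this guide does not confirm, and the function name extract_reply_text and the sample response are made up for illustration.

```python
def extract_reply_text(response_json: dict) -> str:
    """Return the first choice's message content.

    Assumes a chat-completions-style body; the exact response schema
    is an assumption, not something this guide specifies.
    """
    choices = response_json.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Made-up response used only to exercise the helper.
sample_response = {
    "choices": [
        {"message": {"role": "assistant", "content": "hello world"}}
    ]
}
print(extract_reply_text(sample_response))
```

Raising on an empty choices list keeps a failed or rejected request from being mistaken for an empty transcription.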

Supported audio formats are:

wav
mp3
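
Since the format field has to be one of the values listed above, it can help to validate it before sending a request. The sketch below derives the format from the file extension; the helper name audio_format_from_path is hypothetical, and it assumes the extension matches the file's actual encoding.

```python
from pathlib import Path

# Formats listed in this guide.
SUPPORTED_AUDIO_FORMATS = {"wav", "mp3"}


def audio_format_from_path(audio_path: str) -> str:
    """Return the format string for a file, based on its extension only."""
    suffix = Path(audio_path).suffix.lower().lstrip(".")
    if suffix not in SUPPORTED_AUDIO_FORMATS:
        raise ValueError(
            f"unsupported audio format {suffix!r}; "
            f"expected one of {sorted(SUPPORTED_AUDIO_FORMATS)}"
        )
    return suffix


print(audio_format_from_path("path/to/your/audio.wav"))
```

The returned string can be passed as the format value of the input_audio part.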
