Developer Docs — Lens Audio

API Overview

The Lens Audio TTS API provides high-quality text-to-speech synthesis with support for multiple voices, emotion control, and voice cloning.

Base URL: https://audio-chat.ask-lens.ai

Convert text to natural-sounding WAV audio
Stream audio generation via Server-Sent Events
Clone voices using reference audio files
Control speech emotion and speed
Manage API keys and track usage

Authentication

All TTS and voice-clone endpoints require an API key. Pass it using either method:

Bearer token (Authorization header): ``Authorization: Bearer ak_your_api_key_here``

X-API-Key header: ``X-API-Key: ak_your_api_key_here``

Admin endpoints (key management, usage) require the admin secret instead of an API key.
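
As a minimal sketch of the two API key header forms, the helper below builds either one. The function name `auth_headers` and its `use_bearer` flag are illustrative and not part of the API; how the admin secret is passed is not covered here.

```python
def auth_headers(api_key: str, use_bearer: bool = True) -> dict[str, str]:
    """Return the single request header that carries the API key.

    use_bearer=True  -> Authorization: Bearer <key>
    use_bearer=False -> X-API-Key: <key>
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    if use_bearer:
        return {"Authorization": f"Bearer {api_key}"}
    return {"X-API-Key": api_key}
```

Either dictionary can be passed as the `headers` argument of an HTTP client call.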

Rate Limits

The API uses queue-based rate limiting. Each TTS request is placed in a processing queue. When the queue is full, requests are rejected with a 429 status.

Use GET /audio/queue-status to check current queue utilization before submitting requests.

json

{
  "queue_size": 12,
  "max_queue_size": 100,
  "utilization": 0.12
}
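
One way a client might cope with a full queue is to check utilization first and back off when a 429 still comes back. The sketch below is an assumption about client-side policy, not documented API behavior: the 0.9 threshold, the exponential delays, and the names `queue_has_room` and `submit_with_backoff` are all illustrative. The `send` callable stands in for whatever function performs the actual TTS request.

```python
import time
from typing import Callable, Tuple


def queue_has_room(status: dict, threshold: float = 0.9) -> bool:
    """Return True if the reported utilization is below the chosen threshold."""
    return status["utilization"] < threshold


def submit_with_backoff(
    send: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call send() until it returns a non-429 status or attempts run out.

    send() must return (status_code, body). Delays double after each 429:
    base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        sleep(base_delay * 2 ** (attempt - 1))
        status, body = send()
    return status, body
```

Passing `sleep` as a parameter keeps the retry loop easy to test without real waiting.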

Error Code Reference

Status	Meaning
400	Bad Request - Missing or invalid parameters
401	Unauthorized - Missing or invalid API key
403	Forbidden - Insufficient permissions (admin endpoints)
404	Not Found - Resource does not exist
429	Too Many Requests - Queue is full, try again later
500	Internal Server Error - Unexpected failure
502	Bad Gateway - Upstream service error (e.g. S3 download failed)
503	Service Unavailable - TTS engine not ready
504	Gateway Timeout - Request timed out
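
The status table above can be folded into a small client-side exception type. This is a sketch only: the class `LensAudioError`, the helper `error_from_response`, and the choice to treat 429, 503, and 504 as retryable are assumptions made for illustration. It relies on error bodies having the `{"error": "..."}` shape shown in the per-endpoint error tables.

```python
import json

# Short meanings copied from the status table.
STATUS_MEANINGS = {
    400: "Bad Request",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not Found",
    429: "Too Many Requests",
    500: "Internal Server Error",
    502: "Bad Gateway",
    503: "Service Unavailable",
    504: "Gateway Timeout",
}

# Assumption: these statuses describe temporary conditions worth retrying.
TRANSIENT_STATUSES = frozenset({429, 503, 504})


class LensAudioError(Exception):
    """Hypothetical client-side error wrapping a non-2xx response."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message
        self.retryable = status in TRANSIENT_STATUSES


def error_from_response(status: int, body: bytes) -> LensAudioError:
    """Build an error from a status code and a raw response body."""
    try:
        message = json.loads(body).get("error")
    except (ValueError, AttributeError):
        # Body was not JSON, or was JSON but not an object.
        message = None
    if not isinstance(message, str) or not message:
        message = STATUS_MEANINGS.get(status, "Unknown error")
    return LensAudioError(status, message)
```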

TTS

POST /audio/tts

Text to Speech

Convert text to speech audio. Returns a binary WAV file with metadata in response headers.

Request Body

Name	Type	Required	Default	Description
text	string	Yes	—	The text to synthesize into speech.
voice_id	string	No	zh-Somer	The voice to use for synthesis.
emotion	string	No	calm	Emotion style for the speech. Options: happy, sad, angry, calm, surprised, fearful, disgusted, melancholic.
emo_vector	float[8]	No	—	Custom 8-dimensional emotion vector for fine-grained control. Overrides the emotion parameter when provided.
speed	float	No	1.0	Playback speed multiplier. Range: 0.5 - 2.0.
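
The constraints in the table above can be checked on the client before a request is sent. The following is a sketch under stated assumptions: `build_tts_payload` is a hypothetical helper, and dropping `emotion` whenever `emo_vector` is given is a client-side choice based on the note that the vector overrides the emotion parameter. Omitted fields are left out so the server applies its own defaults.

```python
from typing import Optional, Sequence

# Emotion names as listed in the request table.
EMOTIONS = frozenset(
    {"happy", "sad", "angry", "calm", "surprised", "fearful", "disgusted", "melancholic"}
)


def build_tts_payload(
    text: str,
    voice_id: Optional[str] = None,
    emotion: Optional[str] = None,
    emo_vector: Optional[Sequence[float]] = None,
    speed: Optional[float] = None,
) -> dict:
    """Validate the arguments and return a JSON-serializable request body."""
    if not text:
        raise ValueError("text is required")
    payload: dict = {"text": text}
    if voice_id is not None:
        payload["voice_id"] = voice_id
    if emo_vector is not None:
        if len(emo_vector) != 8:
            raise ValueError("emo_vector must have exactly 8 values")
        # The vector overrides emotion, so emotion is not sent alongside it.
        payload["emo_vector"] = [float(x) for x in emo_vector]
    elif emotion is not None:
        if emotion not in EMOTIONS:
            raise ValueError(f"unknown emotion: {emotion!r}")
        payload["emotion"] = emotion
    if speed is not None:
        if not 0.5 <= speed <= 2.0:
            raise ValueError("speed must be between 0.5 and 2.0")
        payload["speed"] = speed
    return payload
```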

Response

The response body is binary WAV audio data, not JSON.

Response Headers

Header	Description
X-Task-Id	Unique identifier for the TTS task.
X-Audio-Duration	Duration of the generated audio in seconds.
X-Queue-Size	Current queue size at the time of processing.
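
Header values arrive as strings, so a client will usually convert them. The sketch below assumes that `X-Audio-Duration` is a plain decimal number and `X-Queue-Size` a plain integer; the `TtsMetadata` class and `parse_tts_headers` function are illustrative names. Lookup is case-insensitive because HTTP header names are.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class TtsMetadata:
    task_id: Optional[str]
    audio_duration: Optional[float]  # seconds
    queue_size: Optional[int]


def parse_tts_headers(headers: Mapping[str, str]) -> TtsMetadata:
    """Extract the TTS metadata headers, leaving missing ones as None."""
    lowered = {name.lower(): value for name, value in headers.items()}
    duration = lowered.get("x-audio-duration")
    queue = lowered.get("x-queue-size")
    return TtsMetadata(
        task_id=lowered.get("x-task-id"),
        audio_duration=float(duration) if duration is not None else None,
        queue_size=int(queue) if queue is not None else None,
    )
```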

Error Codes

Status	Body	Description
400	{"error": "Missing text"}	The required text field was not provided.
400	{"error": "Invalid JSON"}	The request body is not valid JSON.
400	{"error": "Token limit exceeded (max 3000)"}	The input text exceeds the 3000 BPE token limit.
429	{"error": "Queue full, try again later"}	The processing queue is at capacity.
500	{"error": "Empty audio returned"}	The TTS engine returned no audio data.
503	{"error": "TTS engine not ready"}	The TTS engine is still initializing.
504	{"error": "TTS request timed out"}	The request exceeded the maximum processing time.

Code Examples

bash

curl -X POST https://audio-chat.ask-lens.ai/audio/tts \
  -H "Authorization: Bearer ak_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello, world!", "voice_id": "zh-Somer", "emotion": "happy"}' \
  --output output.wav

POST /audio/tts/stream

Text to Speech (Streaming)

Stream text-to-speech synthesis via Server-Sent Events. Provides real-time status updates and base64-encoded audio chunks.

Request Body

Name	Type	Required	Default	Description
text	string	Yes	—	The text to synthesize into speech.
voice_id	string	No	zh-Somer	The voice to use for synthesis.
emotion	string	No	calm	Emotion style for the speech. Options: happy, sad, angry, calm, surprised, fearful, disgusted, melancholic.
emo_vector	float[8]	No	—	Custom 8-dimensional emotion vector for fine-grained control. Overrides the emotion parameter when provided.
speed	float	No	1.0	Playback speed multiplier. Range: 0.5 - 2.0.

Response

text

event: queued
data: {"task_id": "abc-123", "queue_position": 3}

event: processing
data: {"task_id": "abc-123"}

event: audio
data: {"task_id": "abc-123", "audio": "<base64-encoded WAV>", "duration": 2.5}

event: done
data: {"task_id": "abc-123"}
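
A client has to split the stream into events and decode the audio payloads. The sketch below parses the `event:` / `data:` pairs shown above from any iterable of text lines. The function names are illustrative, and the parser handles only the fields that appear in the sample stream. Decoded chunks are returned as a list because the documentation does not say whether multiple chunks can simply be concatenated.

```python
import base64
import json
from typing import Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, parsed_json_data) for each event in the stream."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if event is not None and data:
                yield event, json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
    # Flush a final event that was not followed by a blank line.
    if event is not None and data:
        yield event, json.loads("\n".join(data))


def decode_audio_chunks(lines: Iterable[str]) -> List[bytes]:
    """Collect the base64-decoded payload of every audio event until done."""
    chunks = []
    for event, payload in iter_sse_events(lines):
        if event == "audio":
            chunks.append(base64.b64decode(payload["audio"]))
        elif event == "done":
            break
    return chunks
```

With the `requests` library, `response.iter_lines(decode_unicode=True)` on a request made with `stream=True` yields lines that can be fed to these functions.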

GET /audio/voices

List Voices

Retrieve the list of all available voices for TTS synthesis.

Response

json

{
  "voices": [
    {
      "voice_id": "zh-Somer",
      "file": "zh-Somer.wav"
    },
    {
      "voice_id": "zh-Luna",
      "file": "zh-Luna.wav"
    }
  ],
  "default_voice_id": "zh-Somer",
  "count": 2
}
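
A typical use of this response is to confirm that a voice exists before requesting synthesis. The sketch below is one possible approach: `choose_voice` is a hypothetical helper that returns the preferred voice if it is listed and otherwise falls back to `default_voice_id`.

```python
from typing import Optional


def choose_voice(listing: dict, preferred: Optional[str] = None) -> str:
    """Return preferred if the listing contains it, else the default voice."""
    available = {voice["voice_id"] for voice in listing.get("voices", [])}
    if preferred in available:
        return preferred
    return listing["default_voice_id"]
```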

GET /audio/queue-status

Queue Status

Check the current TTS processing queue size and utilization.

Response

json

{
  "queue_size": 12,
  "max_queue_size": 100,
  "utilization": 0.12
}

Voice Clone

POST /voice-clone/register

Register Voice

Register a new cloned voice from a reference audio file stored on S3.

Request Body

Name	Type	Required	Default	Description
voice_id	string	Yes	—	Unique identifier for the new voice.
s3_url	string	Yes	—	S3 URL of the reference audio file (WAV format recommended).

Response

json

{
  "voice_id": "my-custom-voice",
  "cached": true,
  "file": "my-custom-voice.wav"
}

Error Codes

Status	Body	Description
400	{"error": "Invalid voice_id"}	The voice_id contains invalid characters or is empty.
400	{"error": "Missing s3_url"}	The required s3_url field was not provided.
400	{"error": "Unsupported audio format"}	The reference audio file is not in a supported format.
502	{"error": "S3 download failed"}	Failed to download the reference audio from S3.

DELETE /voice-clone/{voice_id}

Remove Voice

Delete a previously registered cloned voice.

Response

json

{
  "voice_id": "my-custom-voice",
  "deleted": true
}

Error Codes

Status	Body	Description
400	{"error": "Invalid voice_id"}	The voice_id contains invalid characters or is empty.
404	{"error": "Voice not found"}	No voice with the given voice_id exists.
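
The register and remove calls can also be made from Python. The sketch below only builds the requests with the standard library so they can be inspected before sending. The function names are illustrative, and percent-encoding `voice_id` in the path is a precaution rather than a documented requirement.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://audio-chat.ask-lens.ai"


def register_voice_request(api_key: str, voice_id: str, s3_url: str) -> urllib.request.Request:
    """Build the POST /voice-clone/register request (not sent yet)."""
    body = json.dumps({"voice_id": voice_id, "s3_url": s3_url}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/voice-clone/register",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def delete_voice_request(api_key: str, voice_id: str) -> urllib.request.Request:
    """Build the DELETE /voice-clone/{voice_id} request (not sent yet)."""
    return urllib.request.Request(
        f"{BASE_URL}/voice-clone/{quote(voice_id, safe='')}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="DELETE",
    )
```

Passing either object to `urllib.request.urlopen` sends it; that call raises `urllib.error.HTTPError` for 4xx and 5xx responses such as the ones in the error tables above.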

API Key Management

POST /auth/keys

Create API Key

Generate a new API key for a given user.

Request Body

Name	Type	Required	Default	Description
user_id	string	Yes	—	The user identifier to associate with the new key.

Response

json

{
  "user_id": "user-123",
  "api_key": "ak_7a52e8882d90ba41ea9222dab0b972c8650cd5ccf6b19064"
}

Error Codes

Status	Body	Description
400	{"error": "Missing user_id"}	The required user_id field was not provided.
403	{"error": "Unauthorized"}	The request does not have admin privileges.

GET /auth/keys

List API Keys

List all active API keys, optionally filtered by user.

Query Parameters

Name	Type	Required	Default	Description
user_id	string	No	—	Filter keys by user identifier.

Response

json

{
  "keys": [
    {
      "user_id": "user-123",
      "api_key": "ak_7a52e...9064",
      "created_at": "2026-03-10T12:00:00"
    }
  ],
  "count": 1
}

DELETE /auth/keys/by-key

Revoke API Key

Revoke a specific API key.

Request Body

Name	Type	Required	Default	Description
api_key	string	Yes	—	The API key to revoke.

Response

json

{
  "api_key": "ak_7a52e...",
  "revoked": true
}

Error Codes

Status	Body	Description
400	{"error": "Missing api_key"}	The required api_key field was not provided.
403	{"error": "Unauthorized"}	The request does not have admin privileges.
404	{"error": "No active key found"}	The specified API key does not exist or is already revoked.

DELETE /auth/keys/by-user

Revoke All User Keys

Revoke all active API keys for a given user.

Request Body

Name	Type	Required	Default	Description
user_id	string	Yes	—	The user identifier whose keys should be revoked.

Response

json

{
  "user_id": "user-123",
  "revoked_count": 3
}

Error Codes

Status	Body	Description
400	{"error": "Missing user_id"}	The required user_id field was not provided.
403	{"error": "Unauthorized"}	The request does not have admin privileges.
404	{"error": "No active keys found"}	The user has no active keys to revoke.

Usage

GET /auth/usage

Usage Records

Retrieve usage records with optional filtering by user, API key, and time period.

Query Parameters

Name	Type	Required	Default	Description
user_id	string	No	—	Filter records by user identifier.
api_key	string	No	—	Filter records by API key.
period	string	No	—	Time period filter. Options: D (day), W (week), M (month).
limit	number	No	100	Maximum number of records to return.
offset	number	No	0	Number of records to skip for pagination.

Response

json

{
  "records": [
    {
      "user_id": "user-123",
      "api_key": "ak_7a52e...9064",
      "endpoint": "/audio/tts",
      "tokens": 42,
      "audio_duration": 3.2,
      "timestamp": "2026-03-20T14:30:00Z"
    }
  ],
  "summary": {
    "total_requests": 156,
    "total_tokens": 8420,
    "total_audio_duration": 1234.5
  },
  "count": 1,
  "limit": 100,
  "offset": 0
}
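
The `limit` and `offset` parameters suggest a simple paging loop. The sketch below assumes that a page with fewer than `limit` records marks the end of the data, which the documentation does not state explicitly. `fetch_page` is a hypothetical callable that performs the actual GET /auth/usage request, including admin authentication, and returns the parsed JSON.

```python
from typing import Callable, Iterator


def iter_usage_records(
    fetch_page: Callable[[int, int], dict],
    limit: int = 100,
) -> Iterator[dict]:
    """Yield every usage record by walking pages of size limit.

    fetch_page(limit, offset) must return a response dict with a
    "records" list, as in the sample response.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        records = page.get("records", [])
        yield from records
        if len(records) < limit:
            # Assumption: a short (or empty) page means no more data.
            break
        offset += limit
```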

Code Examples

Complete examples in popular programming languages.

Python Client

python

import requests

API_KEY = "ak_your_api_key"
BASE_URL = "https://audio-chat.ask-lens.ai"

headers = {"Authorization": f"Bearer {API_KEY}"}

# --- Text to Speech ---
response = requests.post(
    f"{BASE_URL}/audio/tts",
    headers=headers,
    json={
        "text": "Hello, welcome to Lens Audio!",
        "voice_id": "zh-Somer",
        "emotion": "happy",
        "speed": 1.0,
    },
)

if response.status_code == 200:
    with open("output.wav", "wb") as f:
        f.write(response.content)
    print("Audio duration:", response.headers.get("X-Audio-Duration"), "s")
else:
    print("Error:", response.json())

# --- List Voices ---
voices = requests.get(f"{BASE_URL}/audio/voices", headers=headers).json()
print("Available voices:", [v["voice_id"] for v in voices["voices"]])

# --- Queue Status ---
status = requests.get(f"{BASE_URL}/audio/queue-status", headers=headers).json()
print(f"Queue: {status['queue_size']}/{status['max_queue_size']}")

cURL

bash

# Text to Speech
curl -X POST https://audio-chat.ask-lens.ai/audio/tts \
  -H "Authorization: Bearer ak_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello, welcome to Lens Audio!", "voice_id": "zh-Somer", "emotion": "happy"}' \
  --output output.wav

# List Voices
curl https://audio-chat.ask-lens.ai/audio/voices \
  -H "Authorization: Bearer ak_your_api_key"

# Queue Status
curl https://audio-chat.ask-lens.ai/audio/queue-status \
  -H "Authorization: Bearer ak_your_api_key"

# Register Voice Clone
curl -X POST https://audio-chat.ask-lens.ai/voice-clone/register \
  -H "Authorization: Bearer ak_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"voice_id": "my-voice", "s3_url": "https://s3.amazonaws.com/bucket/ref.wav"}'

# Delete Voice Clone
curl -X DELETE https://audio-chat.ask-lens.ai/voice-clone/my-voice \
  -H "Authorization: Bearer ak_your_api_key"

Node.js

javascript

// Requires Node.js 18 or later, where fetch is available as a global.
const fs = require("fs");

const API_KEY = "ak_your_api_key";
const BASE_URL = "https://audio-chat.ask-lens.ai";

const headers = {
  "Authorization": `Bearer ${API_KEY}`,
  "Content-Type": "application/json",
};

// --- Text to Speech ---
async function textToSpeech(text, voiceId = "zh-Somer", emotion = "calm") {
  const response = await fetch(`${BASE_URL}/audio/tts`, {
    method: "POST",
    headers,
    body: JSON.stringify({ text, voice_id: voiceId, emotion }),
  });

  if (response.ok) {
    const buffer = await response.arrayBuffer();
    fs.writeFileSync("output.wav", Buffer.from(buffer));
    console.log("Duration:", response.headers.get("X-Audio-Duration"), "s");
  } else {
    const error = await response.json();
    console.error("Error:", error);
  }
}

// --- List Voices ---
async function listVoices() {
  const response = await fetch(`${BASE_URL}/audio/voices`, {
    headers: { "Authorization": `Bearer ${API_KEY}` },
  });
  const data = await response.json();
  console.log("Voices:", data.voices.map((v) => v.voice_id));
}

// --- Queue Status ---
async function queueStatus() {
  const response = await fetch(`${BASE_URL}/audio/queue-status`, {
    headers: { "Authorization": `Bearer ${API_KEY}` },
  });
  const data = await response.json();
  console.log(`Queue: ${data.queue_size}/${data.max_queue_size}`);
}

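// Catch network-level failures so they do not surface as unhandled rejections.
textToSpeech("Hello, welcome to Lens Audio!", "zh-Somer", "happy").catch(console.error);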