Universal Audio API
The Universal Audio API provides a single set of endpoints for speech and audio processing: text-to-speech (TTS), audio translation, and audio transcription.
Unified Endpoints
All audio requests are routed through the following specialized endpoints:
1. Text to Speech (TTS)
Endpoint: POST https://audio.siraya.pro/v1/audio/speech
Generates lifelike audio from input text.
2. Audio Translation
Endpoint: POST https://audio.siraya.pro/v1/audio/translations
Translates audio from any supported language into English text.
3. Audio Transcription
Endpoint: POST https://audio.siraya.pro/v1/audio/transcriptions
Transcribes spoken audio into text in the input language.
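The request format for the audio-input endpoints is not described in this section, so the following is only a sketch for the translation endpoint listed above. It assumes the endpoint accepts a multipart/form-data upload with a `file` field and a `model` field, as OpenAI-style audio APIs do, and that the response body is JSON. The model name `whisper-1` and the helper names `encode_multipart` and `translate_audio` are placeholders, not names confirmed by this documentation. Only the Python standard library is used so the sketch runs without extra dependencies.

```python
import json
import os
import urllib.request
import uuid

TRANSLATIONS_URL = "https://audio.siraya.pro/v1/audio/translations"


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns a (body_bytes, content_type_header) tuple.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f'\r\n--{boundary}--\r\n'.encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def translate_audio(path, api_key, model="whisper-1"):
    """Upload an audio file and return the parsed JSON response."""
    # ASSUMPTION: the "file" and "model" field names and the model name
    # are not confirmed by this page; check the Speech to Text section.
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = encode_multipart(
        {"model": model}, "file", os.path.basename(path), audio
    )
    request = urllib.request.Request(
        TRANSLATIONS_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())
```

A call would look like `translate_audio("interview.wav", "<API_KEY>")`, where the file name is only an illustration.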
Usage Example
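import requests

url = "https://audio.siraya.pro/v1/audio/speech"
data = {
    "model": "gpt-4o-mini-tts",
    "input": "Building the future of audio.",
    "voice": "alloy",
}
# Replace <API_KEY> with your API key.
headers = {"Authorization": "Bearer <API_KEY>"}

response = requests.post(url, headers=headers, json=data)
# Stop on an HTTP error instead of saving the error body as audio.
response.raise_for_status()

# Write the returned audio bytes to disk.
with open("output.mp3", "wb") as f:
    f.write(response.content)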
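For repeated use, the same request can be wrapped in a small function. The following is a minimal sketch using only the Python standard library. The function names `build_speech_request` and `synthesize_to_file` and the environment variable `SIRAYA_API_KEY` are hypothetical and chosen for illustration; the URL, model, and voice are the ones shown in the example above.

```python
import json
import os
import shutil
import urllib.request

SPEECH_URL = "https://audio.siraya.pro/v1/audio/speech"


def build_speech_request(text, api_key, voice="alloy", model="gpt-4o-mini-tts"):
    """Build (but do not send) the POST request for the speech endpoint."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(text, out_path, api_key=None, timeout=60):
    """Send the request and stream the returned audio bytes to out_path."""
    # ASSUMPTION: SIRAYA_API_KEY is a hypothetical variable name,
    # not one defined by this documentation.
    api_key = api_key or os.environ["SIRAYA_API_KEY"]
    request = build_speech_request(text, api_key)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses,
    # so the output file is only created after a successful response.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        with open(out_path, "wb") as f:
            shutil.copyfileobj(response, f)
    return out_path
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made.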
For detailed parameters and model lists, refer to the Speech to Text and Text to Speech sections in the Generative Model API.