Overview
Siraya AI provides a unified interface for interacting with various Large Language Models (LLMs). Our API is designed to be compatible with the OpenAI SDK, making it easy to integrate into existing workflows.
Quick start
Using the OpenAI SDK
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://llm.siraya.pro/v1",
    api_key="<API_KEY>",
)

completion = client.chat.completions.create(
    model="claude-3-5-sonnet@20240620",
    messages=[
        {
            "role": "user",
            "content": "What is the meaning of life?"
        }
    ]
)

print(completion.choices[0].message.content)
```
```typescript
import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'https://llm.siraya.pro/v1',
  apiKey: '<API_KEY>',
});

async function main() {
  const completion = await openai.chat.completions.create({
    model: 'claude-3-5-sonnet@20240620',
    messages: [
      {
        role: 'user',
        content: 'What is the meaning of life?',
      },
    ],
  });

  console.log(completion.choices[0].message);
}

main();
```
Using the Siraya AI API directly
```python
import requests
import json

response = requests.post(
    url="https://llm.siraya.pro/v1/chat/completions",
    headers={
        "Authorization": "Bearer <API_KEY>",
        "Content-Type": "application/json"
    },
    data=json.dumps({
        "model": "claude-3-5-sonnet@20240620",
        "messages": [
            {
                "role": "user",
                "content": "What is the meaning of life?"
            }
        ]
    })
)

print(response.json()["choices"][0]["message"]["content"])
```
Requests
Completions Request Format
The request body should be a JSON object containing the model and messages array. For a complete list of parameters, see the Parameters Documentation.
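For illustration, a minimal request body might look like the following; only model and messages are required, and the max_tokens value here is an arbitrary example:

```python
import json

# Minimal chat completions request body: "model" and "messages" are required.
body = {
    "model": "claude-3-5-sonnet@20240620",
    "messages": [
        {"role": "user", "content": "What is the meaning of life?"}
    ],
    # Optional parameters (see the Parameters Documentation) go alongside them.
    "max_tokens": 512,
}

# Serialize to JSON before sending it as the request body.
payload = json.dumps(body)
```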
Headers
To authenticate, include your API key in the Authorization header as a Bearer token.
Assistant Prefill
Siraya AI supports asking models to complete a partial response, which is useful for steering a model toward a particular format or style. To use this feature, include a message with role: "assistant" at the end of your messages array; the model continues generating from that partial content instead of starting a fresh reply.
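As a sketch, a messages array using prefill might look like this (the prompt and prefill text are illustrative):

```python
messages = [
    {"role": "user", "content": "List three prime numbers as JSON."},
    # The trailing assistant message is the prefill: the model continues
    # from this partial content instead of starting a fresh reply.
    {"role": "assistant", "content": '{"primes": ['},
]
```

Pass this array as the messages field of the request body; the returned completion picks up where the prefill leaves off.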
Responses
Completions Response Format
Siraya AI normalizes the response schema across all supported models and providers to comply with the OpenAI Chat API. This ensures that choices is always an array, and each choice contains a message property (or delta for streaming).
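The two normalized shapes can be illustrated with hypothetical payloads, one non-streaming response and one streaming chunk (field values here are made up):

```python
# Non-streaming: each choice carries a full "message".
response = {
    "choices": [
        {"message": {"role": "assistant", "content": "42."}, "finish_reason": "stop"}
    ]
}

# Streaming: each chunk's choice carries a partial "delta" instead.
chunk = {
    "choices": [
        {"delta": {"content": "42"}, "finish_reason": None}
    ]
}

# Reading the completed text from a non-streaming response.
text = response["choices"][0]["message"]["content"]
```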
Finish Reason
The finish_reason is normalized to one of the following:
* stop: Model reached a natural stopping point.
* length: Model reached the max_tokens limit.
* tool_calls: Model called a tool.
* content_filter: Content was omitted due to a flag from our filters.
* error: An error occurred during generation.
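One way a client might branch on the normalized finish_reason (the choice dict shape follows the response format above; the suggested actions are illustrative):

```python
def handle_finish(choice: dict) -> str:
    """Map a normalized finish_reason to a follow-up action (illustrative)."""
    reason = choice.get("finish_reason")
    if reason == "length":
        return "retry with a higher max_tokens or continue the conversation"
    if reason == "tool_calls":
        return "execute the requested tools and return their results"
    if reason in ("content_filter", "error"):
        return "surface the issue to the caller"
    # "stop": the model reached a natural stopping point.
    return "use the completed message"

action = handle_finish({"finish_reason": "length"})
```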
Querying Cost and Stats
The response body (for non-streaming requests) includes a usage field with token counts.
Note: Siraya AI uses a normalized, model-agnostic token count (via GPT-4o tokenizer) in the API response. However, actual billing and model pricing are based on the native token counts provided by the model provider.
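Reading the usage field might look like the following sketch (the token counts are made-up example values):

```python
# A hypothetical non-streaming response body with the normalized usage field.
response = {
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 50,
        "total_tokens": 62,
    }
}

usage = response["usage"]
# These counts come from the normalized GPT-4o tokenizer; billing uses the
# provider's native token counts, so treat them as estimates, not invoices.
print(f"{usage['prompt_tokens']} prompt + {usage['completion_tokens']} completion "
      f"= {usage['total_tokens']} total tokens")
```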