Overview

Siraya AI provides a unified interface for interacting with various Large Language Models (LLMs). Our API is designed to be compatible with the OpenAI SDK, making it easy to integrate into existing workflows.

Quick start

Using the OpenAI SDK

Python:

from openai import OpenAI

client = OpenAI(
    base_url="https://llm.siraya.pro/v1",
    api_key="<API_KEY>",
)

completion = client.chat.completions.create(
    model="claude-3-5-sonnet@20240620",
    messages=[
        {
            "role": "user",
            "content": "What is the meaning of life?"
        }
    ]
)

print(completion.choices[0].message.content)

TypeScript:

import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'https://llm.siraya.pro/v1',
  apiKey: '<API_KEY>',
});

async function main() {
  const completion = await openai.chat.completions.create({
    model: 'claude-3-5-sonnet@20240620',
    messages: [
      {
        role: 'user',
        content: 'What is the meaning of life?',
      },
    ],
  });

  console.log(completion.choices[0].message.content);
}

main();

Using the Siraya AI API directly

Python:

import requests
import json

response = requests.post(
    url="https://llm.siraya.pro/v1/chat/completions",
    headers={
        "Authorization": "Bearer <API_KEY>",
        "Content-Type": "application/json"
    },
    data=json.dumps({
        "model": "claude-3-5-sonnet@20240620",
        "messages": [
            {
                "role": "user",
                "content": "What is the meaning of life?"
            }
        ]
    })
)

print(response.json()["choices"][0]["message"]["content"])

Shell (curl):

curl https://llm.siraya.pro/v1/chat/completions \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet@20240620",
    "messages": [
      {
        "role": "user",
        "content": "What is the meaning of life?"
      }
    ]
  }'

Requests

Completions Request Format

The request body should be a JSON object containing the model and messages array. For a complete list of parameters, see the Parameters Documentation.
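As a sketch of the request shape, the body below pairs the required model and messages fields with a few optional tuning parameters. The parameter names (max_tokens, temperature) follow the OpenAI Chat API convention and are assumptions here; consult the Parameters Documentation for the authoritative list.

```python
# Minimal request body: "model" and "messages" are required.
# The optional parameters below are illustrative, OpenAI-style names;
# see the Parameters Documentation for what Siraya AI actually accepts.
payload = {
    "model": "claude-3-5-sonnet@20240620",
    "messages": [
        {"role": "user", "content": "What is the meaning of life?"}
    ],
    # Optional tuning parameters (assumed from OpenAI compatibility):
    "max_tokens": 512,
    "temperature": 0.7,
}
```

This dictionary can be passed as the JSON body of a POST to /v1/chat/completions, as in the quick-start examples above.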

Headers

To authenticate, include your API key in the Authorization header as a Bearer token.

Authorization: Bearer <YOUR_API_KEY>

Assistant Prefill

Siraya AI supports asking models to complete a partial response, which is useful for guiding a model to respond in a particular way. To use this feature, include a message with role: "assistant" as the last entry in your messages array; the model continues from that partial content.
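For example, a messages array using prefill might look like this (the prompt text is illustrative):

```python
# Assistant prefill: end the messages array with a partial assistant
# turn; the model continues generating from that text.
messages = [
    {"role": "user", "content": "List three prime numbers."},
    # The model's reply will pick up where this partial response ends.
    {"role": "assistant", "content": "Here are three primes: 2,"},
]
```

This array is then passed as the messages parameter of a normal chat completion request.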

Responses

Completions Response Format

Siraya AI normalizes the response schema across all supported models and providers to comply with the OpenAI Chat API. This ensures that choices is always an array, and each choice contains a message property (or delta for streaming).
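Because streaming responses follow the same normalized schema, each chunk carries a delta fragment rather than a full message. The sketch below assembles the final text from a handful of sample chunks; the chunk contents are illustrative, not real model output.

```python
# Streaming chunks in the OpenAI-compatible schema: each choice has a
# "delta" instead of a "message". Sample data for illustration only.
chunks = [
    {"choices": [{"delta": {"role": "assistant"}, "finish_reason": None}]},
    {"choices": [{"delta": {"content": "Hello"}, "finish_reason": None}]},
    {"choices": [{"delta": {"content": ", world"}, "finish_reason": None}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]

text = ""
for chunk in chunks:
    choice = chunk["choices"][0]
    # A delta may carry "role", "content", or be empty on the final chunk.
    text += choice["delta"].get("content", "")

print(text)  # -> Hello, world
```

The same accumulation pattern works on live chunks when stream=True is set on the request.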

Finish Reason

The finish_reason is normalized to one of the following:

* stop: Model reached a natural stopping point.
* length: Model reached the max_tokens limit.
* tool_calls: Model called a tool.
* content_filter: Content was omitted due to a flag from our filters.
* error: An error occurred during generation.
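A dispatch on the normalized values can be sketched as follows; the sample response fragment and the helper function are illustrative, not part of any SDK.

```python
def handle_finish(response: dict) -> str:
    """Map a normalized finish_reason to a short description.

    `response` follows the OpenAI-compatible shape described above.
    """
    reasons = {
        "stop": "natural stop",
        "length": "hit max_tokens",
        "tool_calls": "model called a tool",
        "content_filter": "content omitted by filters",
        "error": "generation error",
    }
    reason = response["choices"][0]["finish_reason"]
    return reasons.get(reason, f"unknown: {reason}")

# Illustrative response fragment:
sample = {"choices": [{"finish_reason": "length", "message": {"content": "..."}}]}
print(handle_finish(sample))  # -> hit max_tokens
```

Checking for length in particular is a common way to detect truncated completions and retry with a larger max_tokens.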

Querying Cost and Stats

The response body (for non-streaming requests) includes a usage field with token counts.

Note: Siraya AI uses a normalized, model-agnostic token count (via GPT-4o tokenizer) in the API response. However, actual billing and model pricing are based on the native token counts provided by the model provider.
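Reading the usage field from a non-streaming response can be sketched as below. The field names follow the OpenAI Chat API and the counts are sample values; remember that these are the normalized (GPT-4o tokenizer) counts, while billing uses the provider's native counts.

```python
# Illustrative non-streaming response fragment with a usage field.
# Counts are normalized via the GPT-4o tokenizer; billing is based on
# the model provider's native token counts.
response = {
    "usage": {
        "prompt_tokens": 14,
        "completion_tokens": 163,
        "total_tokens": 177,
    }
}

usage = response["usage"]
print(f"prompt={usage['prompt_tokens']} "
      f"completion={usage['completion_tokens']} "
      f"total={usage['total_tokens']}")
```

These counts are useful for client-side logging and rough cost tracking, but exact charges should be reconciled against the provider-native numbers.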