Messages

Endpoint

POST https://llm.siraya.pro/v1/messages
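The SDK examples below wrap this endpoint. As a rough sketch, the raw request they construct looks like the following; the header names (`x-api-key`, `anthropic-version`, `content-type`) follow the standard Anthropic Messages API, and the helper name is purely illustrative:

```python
import json

def build_messages_request(api_key, model, prompt, max_tokens=150):
    """Illustrative sketch of the raw HTTP request the SDKs send.

    Returns the URL, headers, and JSON body without performing any
    network call.
    """
    url = "https://llm.siraya.pro/v1/messages"
    headers = {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

url, headers, body = build_messages_request(
    "<API_KEY>", "claude-sonnet-4-5@20250929", "Hello"
)
print(url)
```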

Basic message

Create a non-streaming message.

Example request

import anthropic

client = anthropic.Anthropic(
    api_key='<API_KEY>',
    base_url='https://llm.siraya.pro'
)

message = client.messages.create(
    model='claude-sonnet-4-5@20250929',
    max_tokens=150,
    messages=[
        {
            'role': 'user',
            'content': 'Write a one-sentence bedtime story about a unicorn.'
        }
    ],
    temperature=0.7,
)

print('Response:', message.content[0].text)
print('Usage:', message.usage)

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: '<API_KEY>',
  baseURL: 'https://llm.siraya.pro',
});

const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5@20250929',
  max_tokens: 150,
  messages: [
    {
      role: 'user',
      content: 'Write a one-sentence bedtime story about a unicorn.',
    },
  ],
  temperature: 0.7,
});

console.log('Response:', message.content[0].text);
console.log('Usage:', message.usage);

Response format

{
  "id": "msg_bdrk_01DPGprdvqXyegS3qZVgD3mU",
  "content": [
    {
      "citations": null,
      "text": "A sleepy unicorn with a shimmering silver mane curled up beneath a rainbow, closed her eyes, and dreamed of dancing through clouds made of cotton candy.",
      "type": "text"
    }
  ],
  "model": "claude-sonnet-4-5@20250929",
  "role": "assistant",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "type": "message",
  "usage": {
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0,
    "input_tokens": 20,
    "output_tokens": 39,
    "server_tool_use": null,
    "service_tier": null,
    "cache_creation": {
      "ephemeral_1h_input_tokens": 0,
      "ephemeral_5m_input_tokens": 0
    }
  }
}
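The reply text lives in the first element of `content`, and `usage` reports token counts on both sides of the exchange. A minimal sketch of pulling those fields out, using a trimmed copy of the response above:

```python
import json

# Trimmed copy of the response format shown above.
response = json.loads("""
{
  "content": [{"text": "A sleepy unicorn curled up beneath a rainbow.", "type": "text"}],
  "stop_reason": "end_turn",
  "usage": {"input_tokens": 20, "output_tokens": 39}
}
""")

# The reply text is in the first content block; usage counts are what
# token-based billing is computed from.
reply = response["content"][0]["text"]
total_tokens = (response["usage"]["input_tokens"]
                + response["usage"]["output_tokens"])
print(reply)
print("total tokens:", total_tokens)  # total tokens: 59
```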

Streaming messages

Create a streaming message that delivers tokens as they are generated.

Example request

import anthropic

client = anthropic.Anthropic(
    api_key='<API_KEY>',
    base_url='https://llm.siraya.pro'
)

with client.messages.stream(
    model='claude-sonnet-4-5@20250929',
    max_tokens=150,
    messages=[
        {
            'role': 'user',
            'content': 'Write a one-sentence bedtime story about a unicorn.'
        }
    ],
    temperature=0.7,
) as stream:
    for text in stream.text_stream:
        print(text, end='', flush=True)

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: '<API_KEY>',
  baseURL: 'https://llm.siraya.pro',
});

const stream = await anthropic.messages.create({
  model: 'claude-sonnet-4-5@20250929',
  max_tokens: 150,
  messages: [
    {
      role: 'user',
      content: 'Write a one-sentence bedtime story about a unicorn.',
    },
  ],
  temperature: 0.7,
  stream: true,
});

for await (const event of stream) {
  if (event.type === 'content_block_delta') {
    if (event.delta.type === 'text_delta') {
      process.stdout.write(event.delta.text);
    }
  }
}

Streaming event types

Streaming responses use Server-Sent Events (SSE). The key event types are:

  • message_start - Initial message metadata
  • content_block_start - Start of a content block (text, tool use, etc.)
  • content_block_delta - Incremental content updates
  • content_block_stop - End of a content block
  • message_delta - Final message metadata (stop reason, usage)
  • message_stop - End of the message
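To reassemble the reply, a consumer only needs to watch `content_block_delta` events carrying `text_delta` payloads, as the TypeScript loop above does. A minimal sketch over the same event shapes (the sample event sequence here is illustrative):

```python
# Illustrative SSE event sequence, following the event types listed above.
events = [
    {"type": "message_start"},
    {"type": "content_block_start", "index": 0},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "A sleepy "}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "unicorn."}},
    {"type": "content_block_stop", "index": 0},
    {"type": "message_delta", "delta": {"stop_reason": "end_turn"}},
    {"type": "message_stop"},
]

# Only text deltas contribute to the reply; other events carry metadata.
chunks = []
for event in events:
    if event["type"] == "content_block_delta" and event["delta"]["type"] == "text_delta":
        chunks.append(event["delta"]["text"])

text = "".join(chunks)
print(text)  # A sleepy unicorn.
```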