Reasoning

Many modern models, such as gemini-2.5-pro, run an internal chain-of-thought ("thinking") process before they answer. SIRAYA Model Router lets you access and control this reasoning through the OpenAI Responses API.

Reasoning Effort

Models that support reasoning let you specify an effort level, which determines how much time and compute the model spends thinking before it answers.

Effort Levels

low: Basic reasoning for straightforward logical steps.
medium (default): Standard reasoning for complex tasks.
high: Extensive thinking for challenging problems or code generation.
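
As a minimal sketch, the levels above can be checked on the client before a request is sent. The helper name `build_reasoning` is hypothetical and not part of the SIRAYA API; the allowed values and the `medium` default are taken from the list above.

```python
# Hypothetical helper: validates the effort level and builds the
# `reasoning` object that goes into the request body.
VALID_EFFORTS = ("low", "medium", "high")


def build_reasoning(effort: str = "medium") -> dict:
    """Return the `reasoning` parameter for a request body."""
    if effort not in VALID_EFFORTS:
        raise ValueError(
            f"effort must be one of {VALID_EFFORTS}, got {effort!r}"
        )
    return {"effort": effort}


print(build_reasoning("high"))
```

Rejecting an unknown level locally avoids a round trip for a request that is malformed from the start.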

How to Use Reasoning in Requests

Include the reasoning parameter in your request body.

Python

import requests
import json

response = requests.post(
    "https://llm.siraya.ai/v1/responses",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <API_KEY>",
    },
    json={
        "model": "gemini-2.5-pro",
        "input": [
            {
                "type": "message",
                "role": "user",
                "content": "What is 25 * 37?"
            }
        ],
        "reasoning": {"effort": "high"}
    }
)

# Pretty-print the full JSON response body.
print(json.dumps(response.json(), indent=2))

cURL

curl https://llm.siraya.ai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API_KEY>" \
  -d '{
    "model": "gemini-2.5-pro",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": "What is 25 * 37?"
      }
    ],
    "reasoning": {"effort": "high"}
  }'

Reasoning Output

When reasoning is enabled, the response's output array includes a reasoning item (type "reasoning") alongside the message item that carries the final answer:

{
  "output": [
    {
      "type": "reasoning",
      "id": "rs_1780048369230770380",
      "status": "completed",
      "content": [
        {
          "type": "reasoning_text",
          "text": "Okay, so I've just been presented with a request: to determine the product of 25 and 37. My immediate thought is to recognize this as a ..."
        }
      ]
    },
    {
      "type": "message",
      "id": "msg_1780048369230771800",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "25 * 37 = **925**"
        }
      ]
    }
  ]
}
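
The structure above can be separated into reasoning text and answer text with a few lines of Python. This is a sketch that assumes the response has exactly the shape shown; the function name `split_output` is hypothetical, and the sample reasoning text is shortened for illustration.

```python
def split_output(response_body: dict) -> tuple[str, str]:
    """Return (reasoning_text, answer_text) from a response body."""
    reasoning_parts, answer_parts = [], []
    for item in response_body.get("output", []):
        # `content` may be missing or null on some items, so fall back to [].
        for part in item.get("content") or []:
            if item.get("type") == "reasoning" and part.get("type") == "reasoning_text":
                reasoning_parts.append(part.get("text", ""))
            elif item.get("type") == "message" and part.get("type") == "output_text":
                answer_parts.append(part.get("text", ""))
    return "".join(reasoning_parts), "".join(answer_parts)


# Sample modeled on the response above (reasoning text shortened).
sample = {
    "output": [
        {
            "type": "reasoning",
            "content": [
                {"type": "reasoning_text", "text": "Okay, so I've just been presented with a request ..."}
            ],
        },
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "25 * 37 = **925**"}],
        },
    ]
}

reasoning, answer = split_output(sample)
print(answer)
```

With the request example from earlier, `split_output(response.json())` would return the same pair for a live response, provided the router returns this shape.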

[!TIP] Use streaming to see the reasoning process in real time as it is generated.
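
A rough sketch of how streamed reasoning might be consumed follows. It rests on several assumptions that this page does not confirm: that the router streams server-sent events as `data: {...}` lines, that the stream ends with `data: [DONE]`, and that it uses the OpenAI Responses API event names `response.reasoning_text.delta` and `response.output_text.delta` with a `delta` field. The function name `iter_deltas` is hypothetical, and the sample lines are hand-written, not real server output.

```python
import json
from typing import Iterable, Iterator

# Assumed event names, borrowed from the OpenAI Responses API streaming
# format; the router may emit different ones.
REASONING_DELTA = "response.reasoning_text.delta"
ANSWER_DELTA = "response.output_text.delta"


def iter_deltas(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (kind, text) pairs from server-sent-event lines."""
    for line in lines:
        if not line or not line.startswith("data:"):
            continue  # skip blank separators and `event:` lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream marker
            break
        event = json.loads(payload)
        if event.get("type") == REASONING_DELTA:
            yield "reasoning", event.get("delta", "")
        elif event.get("type") == ANSWER_DELTA:
            yield "answer", event.get("delta", "")


# Offline demo with hand-written sample events.
sample_lines = [
    "event: response.reasoning_text.delta",
    'data: {"type": "response.reasoning_text.delta", "delta": "25 * 37 = 25 * 40 - 25 * 3"}',
    "",
    'data: {"type": "response.output_text.delta", "delta": "925"}',
    "data: [DONE]",
]
for kind, text in iter_deltas(sample_lines):
    print(f"[{kind}] {text}")
```

Against a live endpoint, the same function could be fed the decoded lines of a `requests` response opened with `stream=True`, assuming the request body accepts a `"stream": true` field as the OpenAI Responses API does.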