Structured Outputs

Return structured data from your models.

SIRAYA Router supports structured outputs for compatible models, ensuring responses follow a specific schema format. This feature is particularly useful when you need consistent, well-formatted responses that can be reliably parsed by your application.

Structured outputs allow you to:

Enforce specific JSON Schema validation on model responses
Get consistent, type-safe outputs
Avoid parsing errors and hallucinated fields
Simplify response handling in your application

Model Support

Model	`json_object`	`json_schema`
gemini-2.5-pro	✅	✅
gemini-2.5-flash	✅	✅
claude-sonnet-4.5	✅	✅
claude-opus-4.6	✅	✅
gpt-5.4-pro	✅	✅
gpt-4.1-mini	✅	✅
grok-3	✅	✅

Note: DeepSeek models only support json_object mode. When json_schema is passed, the schema constraint will be ignored and the model will behave as json_object.

Using Structured Outputs

To use structured outputs, include a response_format parameter in your request.

json_schema

Set type to json_schema and provide your schema definition. The model will respond with a JSON object that strictly follows your schema.

Request:

curl -s -X POST https://llm.siraya.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API_KEY>" \
  -d '{
    "model": "gemini-2.5-pro",
    "messages": [
      {"role": "user", "content": "Extract info: Alice is 30 years old and lives in Tokyo."}
    ],
    "max_tokens": 200,
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person_info",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "city": {"type": "string"}
          },
          "required": ["name", "age", "city"],
          "additionalProperties": false
        }
      }
    }
  }'

Response:

{
  "id": "chatcmpl-abc123def456",
  "object": "chat.completion",
  "created": 1774776734,
  "model": "gemini-2.5-pro",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\"name\":\"Alice\",\"age\":30,\"city\":\"Tokyo\"}"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 29,
    "completion_tokens": 124,
    "total_tokens": 153
  }
}

Response including reasoning and detailed usage fields:

{
  "id": "chatcmpl-abcdefghijklmnopqrstuvwx",
  "object": "chat.completion",
  "created": 1780050978,
  "model": "gemini-2.5-pro",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\n  \"name\": \"Alice\",\n  \"age\": 30,\n  \"city\": \"Tokyo\"\n}",
        "refusal": null,
        "reasoning": "Alright, so the task is to extract specific information about someone named Alice – her name, age, and the city she's in...",
        "reasoning_content": "Alright, so the task is to extract specific information about someone named Alice ...",
        "reasoning_details": [
          {
            "type": "reasoning.text",
            "text": "Alright, so the task is to extract specific information about someone named Alice – her name, age, and the city she's in...",
            "id": "reasoning-text-0",
            "format": "google-gemini-v1",
            "index": 0
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 79,
    "total_tokens": 104,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "cache_write_tokens": 0,
      "text_tokens": 0,
      "audio_tokens": 0,
      "image_tokens": 0,
      "web_search_requests": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 51,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0,
      "text_tokens": 0,
      "image_tokens": 0
    }
  }
}
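
In both responses, the content field is a JSON-encoded string rather than a nested object, so client code has to decode it before use. The following is a minimal Python sketch of the same request payload and of decoding the reply. The helper names build_person_request and extract_structured_content are illustrative and not part of any SIRAYA SDK, and the payload still has to be POSTed to the endpoint shown above with an HTTP client of your choice.

```python
import json


def build_person_request(text: str, model: str = "gemini-2.5-pro") -> dict:
    """Build a chat completion payload that asks for person_info JSON."""
    schema = {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "city": {"type": "string"},
        },
        "required": ["name", "age", "city"],
        "additionalProperties": False,
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": f"Extract info: {text}"}],
        "max_tokens": 200,
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "person_info", "strict": True, "schema": schema},
        },
    }


def extract_structured_content(response: dict) -> dict:
    """Decode the JSON string held in choices[0].message.content."""
    return json.loads(response["choices"][0]["message"]["content"])


# Trimmed copy of the first example response above.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "{\"name\":\"Alice\",\"age\":30,\"city\":\"Tokyo\"}",
            },
            "finish_reason": "stop",
        }
    ]
}

payload = build_person_request("Alice is 30 years old and lives in Tokyo.")
person = extract_structured_content(sample_response)
print(person["name"], person["age"], person["city"])
```

The sketch reads only choices[0].message.content, so it works with either response shape shown above.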

Complex Nested Schema

Structured outputs support complex schemas with nested objects, arrays, and enums.

Request:

curl -s -X POST https://llm.siraya.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API_KEY>" \
  -d '{
    "model": "claude-sonnet-4.5",
    "messages": [
      {"role": "user", "content": "Parse this invoice: Customer John Smith (john@example.com) ordered 2x Widget A at $9.99 and 1x Widget B at $24.50. Invoice #INV-2026-001, total $44.48 USD, status paid."}
    ],
    "max_tokens": 500,
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "invoice",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": {
            "invoice_id": {"type": "string"},
            "customer": {
              "type": "object",
              "properties": {
                "name": {"type": "string"},
                "email": {"type": "string"}
              },
              "required": ["name", "email"],
              "additionalProperties": false
            },
            "items": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "product": {"type": "string"},
                  "quantity": {"type": "integer"},
                  "unit_price": {"type": "number"}
                },
                "required": ["product", "quantity", "unit_price"],
                "additionalProperties": false
              }
            },
            "total": {"type": "number"},
            "currency": {"type": "string"},
            "status": {"type": "string", "enum": ["paid", "pending", "overdue"]}
          },
          "required": ["invoice_id", "customer", "items", "total", "currency", "status"],
          "additionalProperties": false
        }
      }
    }
  }'

Response:

{
  "id": "chatcmpl-xyz789ghi012",
  "object": "chat.completion",
  "created": 1774776838,
  "model": "claude-sonnet-4.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\"currency\":\"USD\",\"customer\":{\"email\":\"john@example.com\",\"name\":\"John Smith\"},\"invoice_id\":\"INV-2026-001\",\"items\":[{\"product\":\"Widget A\",\"quantity\":2,\"unit_price\":9.99},{\"product\":\"Widget B\",\"quantity\":1,\"unit_price\":24.5}],\"status\":\"paid\",\"total\":44.48}"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 843,
    "completion_tokens": 364,
    "total_tokens": 1207
  }
}

Response including reasoning and detailed usage fields:

{
  "id": "chatcmpl-RCLrho323ltniX9fDNtLEWzU",
  "object": "chat.completion",
  "created": 1780051250,
  "model": "claude-sonnet-4.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\"currency\":\"USD\",\"customer\":{\"email\":\"john@example.com\",\"name\":\"John Smith\"},\"invoice_id\":\"INV-2026-001\",\"items\":[{\"product\":\"Widget A\",\"quantity\":2,\"unit_price\":9.99},{\"product\":\"Widget B\",\"quantity\":1,\"unit_price\":24.5}],\"status\":\"paid\",\"total\":44.48}",
        "refusal": null,
        "reasoning": "The user wants me to parse an invoice with the following details:\n- Customer: John Smith (john@example.com)\n- Items: \n  - 2x Widget A at $9.99 each\n  - 1x Widget B at $24.50 each\n- Invoice ID: INV-2026-001\n- Total: $44.48\n- Currency: USD\n- Status: paid\n\nI need to call the json_tool_call function with these details structured according to the schema.",
        "reasoning_content": "The user wants me to parse an invoice with the following details:\n- Customer: John Smith (john@example.com)\n- Items: \n  - 2x Widget A at $9.99 each\n  - 1x Widget B at $24.50 each\n- Invoice ID: INV-2026-001\n- Total: $44.48\n- Currency: USD\n- Status: paid\n\nI need to call the json_tool_call function with these details structured according to the schema.",
        "reasoning_details": [
          {
            "type": "reasoning.text",
            "text": "The user wants me to parse an invoice with the following details:\n- Customer: John Smith (john@example.com)\n- Items: \n  - 2x Widget A at $9.99 each\n  - 1x Widget B at $24.50 each\n- Invoice ID: INV-2026-001\n- Total: $44.48\n- Currency: USD\n- Status: paid\n\nI need to call the json_tool_call function with these details structured according to the schema.",
            "signature": "E...",
            "id": "reasoning-text-0",
            "format": "anthropic-claude-v1",
            "index": 0
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 843,
    "completion_tokens": 362,
    "total_tokens": 1205,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "cache_write_tokens": 0,
      "text_tokens": 0,
      "audio_tokens": 0,
      "image_tokens": 0,
      "web_search_requests": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 102,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0,
      "text_tokens": 0,
      "image_tokens": 0
    }
  }
}
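
A schema constrains the shape of the output, not its arithmetic, so it can still be worth cross-checking extracted values. The sketch below, using hypothetical LineItem and Invoice dataclasses and an assumed half-cent tolerance, loads the content string from the response above into typed objects and verifies that the line items add up to the stated total.

```python
import json
import math
from dataclasses import dataclass


@dataclass
class LineItem:
    product: str
    quantity: int
    unit_price: float


@dataclass
class Invoice:
    invoice_id: str
    customer_name: str
    customer_email: str
    items: list
    total: float
    currency: str
    status: str


def parse_invoice(content: str) -> Invoice:
    """Decode the JSON string from message.content into an Invoice."""
    data = json.loads(content)
    return Invoice(
        invoice_id=data["invoice_id"],
        customer_name=data["customer"]["name"],
        customer_email=data["customer"]["email"],
        items=[LineItem(**item) for item in data["items"]],
        total=data["total"],
        currency=data["currency"],
        status=data["status"],
    )


def total_matches_items(invoice: Invoice, tolerance: float = 0.005) -> bool:
    """Check that quantity * unit_price summed over items equals the total."""
    computed = sum(item.quantity * item.unit_price for item in invoice.items)
    return math.isclose(computed, invoice.total, abs_tol=tolerance)


# The content string from the example response above.
content = (
    '{"currency":"USD","customer":{"email":"john@example.com","name":"John Smith"},'
    '"invoice_id":"INV-2026-001","items":[{"product":"Widget A","quantity":2,'
    '"unit_price":9.99},{"product":"Widget B","quantity":1,"unit_price":24.5}],'
    '"status":"paid","total":44.48}'
)

invoice = parse_invoice(content)
print(invoice.invoice_id, total_matches_items(invoice))
```

Because the schema sets additionalProperties to false on each item, unpacking an item with LineItem(**item) should not receive unexpected keys when the schema is enforced.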

json_object

For simpler use cases where you just need valid JSON without strict schema enforcement, use json_object mode.

Request:

curl -s -X POST https://llm.siraya.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API_KEY>" \
  -d '{
    "model": "gemini-2.5-pro",
    "messages": [
      {"role": "user", "content": "Return a JSON object with keys \"greeting\" and \"language\" for: Say hello in French."}
    ],
    "max_tokens": 300,
    "response_format": {"type": "json_object"}
  }'

Response:

{
  "id": "chatcmpl-abcdefghijklmnopqrstuvwx",
  "object": "chat.completion",
  "created": 1780051533,
  "model": "gemini-2.5-pro",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\n  \"greeting\": \"Bonjour\",\n  \"language\": \"French\"\n}",
        "refusal": null,
        "reasoning": "Alright, let's break this down. The user's instruction is crystal clear: they need a JSON object. My immediate ...",
        "reasoning_content": "Alright, let's break this down. The user's instruction is crystal clear: they need a JSON object. My immediate thought is ...",
        "reasoning_details": [
          {
            "type": "reasoning.text",
            "text": "Alright, let's break this down. The user's instruction is crystal clear: they need a JSON object. My immediate thought is to ...",
            "id": "reasoning-text-0",
            "format": "google-gemini-v1",
            "index": 0
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 85,
    "total_tokens": 105,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "cache_write_tokens": 0,
      "text_tokens": 0,
      "audio_tokens": 0,
      "image_tokens": 0,
      "web_search_requests": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 66,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0,
      "text_tokens": 0,
      "image_tokens": 0
    }
  }
}
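
Since json_object mode only promises valid JSON, the keys requested in the prompt are not guaranteed to appear. A small client-side check, sketched here with a hypothetical parse_json_object helper, can catch a reply that is valid JSON but lacks the expected keys.

```python
import json


def parse_json_object(content: str, required_keys: set) -> dict:
    """Decode json_object output and confirm the keys the prompt asked for."""
    data = json.loads(content)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = required_keys - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


# The content string from the example response above.
content = "{\n  \"greeting\": \"Bonjour\",\n  \"language\": \"French\"\n}"

result = parse_json_object(content, {"greeting", "language"})
print(result["greeting"], result["language"])
```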

Streaming with Structured Outputs

Structured outputs are supported with streaming responses. The model streams the JSON in fragments; an individual fragment is usually not parseable on its own, but once the stream completes, the concatenated fragments form a valid response matching your schema.

To enable streaming with structured outputs, add stream: true to your request:

{
  "model": "gemini-2.5-pro",
  "messages": [{"role": "user", "content": "Extract person info from the text."}],
  "stream": true,
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "person_info",
      "strict": true,
      "schema": { ... }
    }
  }
}

Note: For Claude models, streaming with json_schema uses a "fake stream" approach: the full response is generated first, then delivered as SSE chunks. This ensures schema compliance.
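
On the client side, the streamed fragments need to be concatenated before decoding. The sketch below assumes OpenAI-style SSE chunks, where each data: line carries a JSON object with the text fragment under choices[].delta.content and the stream ends with data: [DONE]. That chunk shape is an assumption and is not documented on this page, so verify it against a real stream. The collect_streamed_json and sse helpers are hypothetical.

```python
import json


def collect_streamed_json(sse_lines) -> dict:
    """Join delta.content fragments from SSE lines and decode the result."""
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream marker
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            fragment = choice.get("delta", {}).get("content")
            if fragment:
                parts.append(fragment)
    # Decode only after the stream has finished; fragments alone are partial.
    return json.loads("".join(parts))


def sse(fragment: str) -> str:
    """Build one simulated SSE line in the assumed chunk shape."""
    chunk = {"choices": [{"index": 0, "delta": {"content": fragment}}]}
    return "data: " + json.dumps(chunk)


# Simulated stream; a real client would iterate over the HTTP response lines.
sample_stream = [
    sse('{"name":"Al'),
    "",
    sse('ice","age":30,'),
    "",
    sse('"city":"Tokyo"}'),
    "",
    "data: [DONE]",
]

person = collect_streamed_json(sample_stream)
print(person)
```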

json_schema vs json_object

	`json_object`	`json_schema`
Guarantees valid JSON	✅	✅
Guarantees correct field names	❌	✅
Guarantees correct field types	❌	✅
Guarantees required fields present	❌	✅
Prevents extra fields	❌	✅ (`strict: true`)
Requires prompt guidance	Yes	No (schema is the constraint)

Best Practices

Set sufficient max_tokens: Some models (e.g., Gemini) use reasoning tokens that count toward the limit. Set max_tokens high enough to accommodate both reasoning and the JSON output.
Use strict: true: Always set strict: true in your json_schema to ensure the model follows your schema exactly.
Include additionalProperties: false: Prevents the model from adding unexpected fields to the output.
Include descriptions: Add clear description fields to your schema properties to guide the model.
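
The schema-related practices above can be applied mechanically. The sketch below uses two hypothetical helpers: close_objects adds additionalProperties: false to every object node that does not already set it, and strict_response_format wraps the result with strict: true. It only walks properties and array items, so schemas using other keywords (for example anyOf or $defs) would need extra handling.

```python
import copy


def close_objects(schema: dict) -> dict:
    """Return a copy with additionalProperties: false on each object node."""
    schema = copy.deepcopy(schema)  # leave the caller's schema untouched

    def visit(node):
        if not isinstance(node, dict):
            return
        if node.get("type") == "object":
            node.setdefault("additionalProperties", False)
            for child in node.get("properties", {}).values():
                visit(child)
        elif node.get("type") == "array":
            visit(node.get("items"))

    visit(schema)
    return schema


def strict_response_format(name: str, schema: dict) -> dict:
    """Wrap a schema in a response_format block with strict: true."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": close_objects(schema)},
    }


person_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "Full name of the person"},
        "address": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City of residence"}
            },
            "required": ["city"],
        },
    },
    "required": ["name", "address"],
}

response_format = strict_response_format("person_info", person_schema)
print(response_format["json_schema"]["schema"]["additionalProperties"])
```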

Error Handling

When using structured outputs, you may encounter these scenarios:

Model doesn't support json_schema: DeepSeek models will ignore the schema and behave as json_object. The request will not fail, but the output may not match your schema.
Truncated JSON: If max_tokens is too low, the JSON output may be truncated (finish_reason: "length"). Increase max_tokens to resolve.
Invalid schema: The request returns an error if your JSON Schema is invalid.
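
The truncation case can be detected before attempting to decode the output. The following sketch checks finish_reason first and then decodes the content. TruncatedOutputError and parse_or_raise are hypothetical names, and what to do on failure (retry with a larger max_tokens, fall back, or surface the error) is left to the caller.

```python
import json


class TruncatedOutputError(Exception):
    """Raised when the model stopped because it reached max_tokens."""


def parse_or_raise(response: dict) -> dict:
    """Decode structured output, reporting truncation separately."""
    choice = response["choices"][0]
    if choice.get("finish_reason") == "length":
        raise TruncatedOutputError("output hit max_tokens; retry with a higher limit")
    try:
        return json.loads(choice["message"]["content"])
    except json.JSONDecodeError as exc:
        raise ValueError(f"model returned invalid JSON: {exc}") from exc


# Simulated truncated response: the JSON string stops mid-value.
truncated = {
    "choices": [
        {"message": {"content": "{\"name\":\"Al"}, "finish_reason": "length"}
    ]
}
complete = {
    "choices": [
        {"message": {"content": "{\"name\":\"Alice\"}"}, "finish_reason": "stop"}
    ]
}

try:
    parse_or_raise(truncated)
    was_truncated = False
except TruncatedOutputError:
    was_truncated = True

print(was_truncated, parse_or_raise(complete))
```

For models that ignore the schema, a successful decode still does not prove that the output matches it, so a key or type check after parse_or_raise remains advisable.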