Parameters

Sampling parameters shape the token generation process of the model. You may send any parameters from the following list to the SIRAYA Model Router API.

If a parameter is omitted, SIRAYA Model Router falls back to a standard default (e.g., temperature defaults to 1.0). Backend-specific parameters must be passed via extra_body; they cannot be placed at the top level of the request body.

Please refer to the Available Models section to confirm which parameters are supported by each model.

Core Parameters

Temperature

Key: temperature
Type: Float (0.0 to 2.0)
Default: 1.0

Influences the variety in the model's responses. Lower values lead to more predictable responses, while higher values encourage diversity. At 0, the model always picks the most likely token, so the same input yields the same response in most cases.
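
To make the effect concrete, here is a minimal sketch of how temperature is commonly applied during sampling: the raw token scores (logits) are divided by the temperature before the softmax. The function name and the example logits are illustrative assumptions, not part of the SIRAYA API, and individual backends may implement this differently.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        # Treat 0 as greedy decoding: all probability mass on the largest logit.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.2)  # sharply peaked on the first token
warm = softmax_with_temperature(logits, 1.5)  # flatter, more diverse
```

With these inputs the low-temperature distribution puts far more weight on the top token than the high-temperature one does.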

Top P

Key: top_p
Type: Float (0.0 to 1.0)
Default: 1.0

Limits the model's choices to the most likely tokens (nucleus sampling). Only the smallest set of top tokens whose cumulative probability reaches P is considered.
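
The selection rule can be sketched as follows. This is an illustrative implementation of nucleus filtering under the usual definition, not the router's or any backend's actual code; the function name and probabilities are made up for the example.

```python
def nucleus_filter(probs, top_p):
    """Return the indices kept by nucleus sampling, most likely first."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break  # the kept set now covers at least top_p of the mass
    return kept

probs = [0.5, 0.3, 0.15, 0.05]  # hypothetical next-token distribution
kept = nucleus_filter(probs, 0.9)  # keeps the three most likely tokens
```

The model then samples only among the kept tokens, after renormalizing their probabilities.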

N

Key: n
Type: Integer (1 or above)
Default: 1

Number of chat completion choices to generate for each input message.

Max Tokens

Key: max_tokens
Type: Integer (1 or above)

Sets the upper limit on the number of tokens the model can generate. Deprecated: use max_completion_tokens instead.

Max Completion Tokens

Key: max_completion_tokens
Type: Integer (1 or above)

Maximum number of tokens to generate (the parameter OpenAI recommends). Takes precedence over max_tokens if both are set.
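
The precedence rule can be expressed as a small helper. This is a hypothetical client-side sketch of the documented behavior (the function name is an assumption); the router applies the rule itself, so the helper is only useful for reasoning about which limit will be in effect.

```python
def effective_token_limit(request):
    """Pick the generation limit, preferring max_completion_tokens when present."""
    if request.get("max_completion_tokens") is not None:
        return request["max_completion_tokens"]
    # Fall back to the deprecated field; None means no explicit limit was sent.
    return request.get("max_tokens")

both = {"max_tokens": 100, "max_completion_tokens": 50}
limit = effective_token_limit(both)  # the newer field wins
```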

Stream

Key: stream
Type: Boolean
Default: false

Enables streaming responses via Server-Sent Events. See the Streaming Guide for details.

Stream Options

Key: stream_options
Type: Object

Streaming configuration. Use {"include_usage": true} to include token usage in the final streaming chunk.

User

Key: user
Type: String

A unique identifier representing your end-user, which can help monitor and detect abuse.

Penalties and Bias

Frequency Penalty

Key: frequency_penalty
Type: Float (-2.0 to 2.0)
Default: 0.0

Penalizes tokens in proportion to how often they have already appeared in the text so far. Positive values make the model less likely to repeat the same tokens.

Presence Penalty

Key: presence_penalty
Type: Float (-2.0 to 2.0)
Default: 0.0

Penalizes tokens based on whether they have already appeared in the text. Encourages the model to talk about new topics.
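
The difference between the two penalties is easier to see in code. The sketch below follows the adjustment OpenAI documents for its own models (subtract count times frequency_penalty, plus a flat presence_penalty for any token seen at least once); whether every backend behind the router uses exactly this formula is an assumption. Token IDs and scores are invented for the example.

```python
from collections import Counter

def apply_penalties(logits, generated_ids, frequency_penalty=0.0, presence_penalty=0.0):
    """Return a copy of logits (token id -> score) adjusted for tokens already generated."""
    counts = Counter(generated_ids)
    adjusted = dict(logits)
    for token_id, count in counts.items():
        if token_id in adjusted:
            adjusted[token_id] -= count * frequency_penalty  # grows with each repeat
            adjusted[token_id] -= presence_penalty           # flat, applied once
    return adjusted

logits = {101: 3.0, 102: 2.5, 103: 1.0}  # hypothetical token ids and scores
history = [101, 101, 102]                # tokens generated so far
penalized = apply_penalties(logits, history, frequency_penalty=0.5, presence_penalty=0.25)
```

Token 101 appeared twice, so it loses more from the frequency penalty than token 102, while token 103 is untouched.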

Logit Bias

Key: logit_bias
Type: Map (token ID to bias value from -100 to 100)

Modifies the likelihood of specific tokens appearing in the completion.
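
As a rough illustration, assuming the bias is added to each token's logit before sampling (the behavior OpenAI documents; other backends may differ), a bias map could be applied like this. The helper name and token IDs are hypothetical.

```python
def apply_logit_bias(logits, logit_bias):
    """Add per-token bias values to a copy of logits (token id -> score)."""
    adjusted = dict(logits)
    for token_id, bias in logit_bias.items():
        if not -100 <= bias <= 100:
            raise ValueError(f"bias for token {token_id} must be between -100 and 100")
        key = int(token_id)  # JSON object keys arrive as strings
        if key in adjusted:
            adjusted[key] += bias
    return adjusted

logits = {101: 3.0, 102: 2.5, 103: 1.0}  # hypothetical token ids and scores
biased = apply_logit_bias(logits, {"101": -100, "103": 5})
```

A value near -100 effectively bans a token, and a value near 100 effectively forces it.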

Output Structure

Response Format

Key: response_format
Type: Object (e.g., { "type": "json_object" })

Forces the model to produce a specific output format. Setting to json_object enables JSON mode.

Stop

Key: stop
Type: Array of Strings (up to 4)

Stops generation as soon as any of the specified sequences is generated.
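
The effect on the returned text can be approximated client-side as below. The sketch assumes the matched sequence itself is left out of the output, as in OpenAI's API; that detail and the helper name are assumptions rather than documented router behavior.

```python
def truncate_at_stop(text, stop):
    """Cut text at the earliest occurrence of any stop sequence."""
    if len(stop) > 4:
        raise ValueError("at most 4 stop sequences are allowed")
    cut = len(text)
    for seq in stop:
        idx = text.find(seq) if seq else -1  # ignore empty strings
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

raw = "Step 1\nStep 2\nEND more text"
shown = truncate_at_stop(raw, ["END", "\n\n"])
```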

Seed

Key: seed
Type: Integer

Used for deterministic sampling. Requests with the same seed and parameters should return similar results.

Logprobs

Key: logprobs
Type: Boolean
Default: false

Whether to return token log probabilities of output tokens.

Top Logprobs

Key: top_logprobs
Type: Integer (0-20)

Number of most likely tokens to return at each token position, each with its log probability. Requires logprobs=true.
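
Because top_logprobs depends on logprobs, it can be worth checking the pair before sending a request. The following is a hypothetical client-side check (the function name and error messages are assumptions), based only on the constraints stated above.

```python
def validate_logprobs(request):
    """Raise ValueError if top_logprobs is used without logprobs or is out of range."""
    top = request.get("top_logprobs")
    if top is None:
        return request
    if not request.get("logprobs", False):
        raise ValueError("top_logprobs requires logprobs=true")
    if not 0 <= top <= 20:
        raise ValueError("top_logprobs must be between 0 and 20")
    return request

ok = validate_logprobs({"logprobs": True, "top_logprobs": 5})
```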

Tools

Tools and Tool Choice

Key: tools, tool_choice
Type: Array (Tools), String or Object (Tool Choice)

Enables tool calling following the OpenAI specification. See the Tool Calling Guide for details.

Parallel Tool Calls

Key: parallel_tool_calls
Type: Boolean

Whether to allow parallel tool calls.

Reasoning

Reasoning Effort

Key: reasoning_effort
Type: String (low, medium, high)

Controls reasoning effort level. Supported by OpenAI o1/o3 and Gemini 2.5 series.

Reasoning

Key: reasoning
Type: Object

Reasoning config object (OpenRouter compatible). If both reasoning and reasoning_effort are set, reasoning.effort takes precedence.

Field	Type	Description
`effort`	string	Effort level: `xhigh`, `high`, `medium`, `low`, `minimal`, `none`
`max_tokens`	integer	Direct token budget for reasoning
`exclude`	boolean	If true, model reasons internally but doesn't return reasoning
`enabled`	boolean	Explicitly enable/disable reasoning
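
The precedence between the two reasoning controls can be sketched as a small resolver. Falling back to reasoning_effort when the reasoning object has no effort field is an assumption of this sketch; the text above only specifies the case where both are set. The function name is hypothetical.

```python
def resolve_reasoning_effort(request):
    """Return the effort level that applies, preferring reasoning.effort."""
    reasoning = request.get("reasoning") or {}
    if reasoning.get("effort") is not None:
        return reasoning["effort"]
    # Assumed fallback: use the flat parameter when reasoning.effort is absent.
    return request.get("reasoning_effort")

request = {"reasoning_effort": "low", "reasoning": {"effort": "high"}}
effort = resolve_reasoning_effort(request)  # reasoning.effort wins
```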

Thinking (Extended Thinking)

Key: thinking
Type: Object

Configuration for extended thinking. Can be passed as a top-level field or via extra_body. Supported by Claude 3.7+ and Gemini 2.5 series.

Field	Type	Description
`type`	string	`enabled` or `disabled`
`budget_tokens`	integer	Maximum token count for thinking (minimum 1024)

{
  "thinking": {
    "type": "enabled",
    "budget_tokens": 10000
  }
}
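
A request fragment like the one above could be produced by a small helper that enforces the 1024-token minimum. The helper is hypothetical, and treating budget_tokens as required whenever thinking is enabled is an assumption of this sketch.

```python
def thinking_config(budget_tokens=None, enabled=True):
    """Build the thinking fragment of a request body."""
    if not enabled:
        return {"thinking": {"type": "disabled"}}
    # Assumption: a budget is required when thinking is enabled.
    if budget_tokens is None or budget_tokens < 1024:
        raise ValueError("budget_tokens must be at least 1024")
    return {"thinking": {"type": "enabled", "budget_tokens": budget_tokens}}

fragment = thinking_config(10000)
```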

Web Search

Web Search Options

Key: web_search_options
Type: Object

Enables web search for supported models. Pass an empty object {} to enable it with default settings.

Field	Type	Description
`search_context_size`	string	Search context amount: `low`, `medium` (default), `high`
`user_location`	object	User location for localized search results
`user_location.approximate.city`	string	City name (e.g., `"Beijing"`)
`user_location.approximate.country`	string	ISO 3166-1 country code (e.g., `"CN"`)
`user_location.approximate.timezone`	string	IANA timezone (e.g., `"Asia/Shanghai"`)

{
  "web_search_options": {
    "search_context_size": "high",
    "user_location": {
      "type": "approximate",
      "approximate": {
        "city": "Beijing",
        "country": "CN",
        "timezone": "Asia/Shanghai"
      }
    }
  }
}

Routing

Provider

Key: provider
Type: Object

Routing preferences (OpenRouter compatible). Controls whether requests are load-balanced across backends or routed according to an explicit sort.

Field	Type	Description
`sort`	string	Sort backends by `price`, `throughput`, or `latency`
`buffer`	float	Tolerance range for sorting; backends within range are equivalent (default: 0.1)
`order`	array	Explicit backend order
`require_parameters`	array	Required backend capabilities (e.g. `["tools", "streaming"]`)
`allow_fallbacks`	boolean	Whether to allow backup backends (default: `true`)
`only`	array	Whitelist of backend slugs to use
`ignore`	array	Blacklist of backend slugs to exclude
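
As an illustration of how these fields combine, the sketch below assembles a provider object that sorts by latency and excludes one backend. The helper name and the backend slug "backend-b" are placeholders invented for the example; only the field names and sort values come from the table above.

```python
import json

ALLOWED_SORTS = {"price", "throughput", "latency"}

def provider_preferences(sort=None, only=None, ignore=None, allow_fallbacks=True):
    """Build the provider fragment of a request body."""
    prefs = {"allow_fallbacks": allow_fallbacks}
    if sort is not None:
        if sort not in ALLOWED_SORTS:
            raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")
        prefs["sort"] = sort
    if only:
        prefs["only"] = list(only)
    if ignore:
        prefs["ignore"] = list(ignore)
    return {"provider": prefs}

body = provider_preferences(sort="latency", ignore=["backend-b"])
print(json.dumps(body, indent=2))
```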

Transforms

Key: transforms
Type: Array of Strings

Message transforms. Supported: ["middle-out"] to compress prompts exceeding context size. Set to [] to disable.

Backend-Specific Parameters

Extra Body

Key: extra_body
Type: Object (map[string]any)

Pass-through parameters for the backend. Backend-specific parameters (e.g., guardrail or safety-setting options) must be passed via extra_body; they cannot be placed at the top level of the request body.

{
  "extra_body": {
    "thinking": {
      "type": "enabled",
      "budget_tokens": 10000
    }
  }
}
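
One way to respect this rule on the client side is to route any key that is not documented on this page into extra_body automatically. The helper below is a hypothetical sketch: the key list is copied from this page plus the usual model and messages fields (assumed from the OpenAI-compatible format), and safety_settings and the model name are invented stand-ins for a backend-specific option and a real model ID.

```python
TOP_LEVEL_KEYS = {
    "model", "messages", "temperature", "top_p", "n", "max_tokens",
    "max_completion_tokens", "stream", "stream_options", "user",
    "frequency_penalty", "presence_penalty", "logit_bias", "response_format",
    "stop", "seed", "logprobs", "top_logprobs", "tools", "tool_choice",
    "parallel_tool_calls", "reasoning_effort", "reasoning", "thinking",
    "web_search_options", "provider", "transforms",
}

def build_request_body(params):
    """Keep documented keys at the top level and move everything else into extra_body."""
    body = {}
    extra = dict(params.get("extra_body", {}))  # preserve anything already nested
    for key, value in params.items():
        if key == "extra_body":
            continue
        if key in TOP_LEVEL_KEYS:
            body[key] = value
        else:
            extra[key] = value
    if extra:
        body["extra_body"] = extra
    return body

body = build_request_body({
    "model": "example-model",              # placeholder model name
    "temperature": 0.7,
    "safety_settings": {"level": "strict"},  # hypothetical backend-specific option
})
```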