Parameters
Sampling parameters shape the token generation process of the model. You may send any parameters from the following list to the Siraya AI API.
Siraya AI will default to standard values if certain parameters are absent (e.g., temperature defaults to 1.0). Provider-specific parameters, such as safe_prompt for Mistral or raw_mode for Hyperbolic, are passed directly to the respective providers if specified.
Please refer to the Available Models section to confirm which parameters are supported by each model.
Core Parameters
Temperature
- Key: temperature
- Type: Float (0.0 to 2.0)
- Default: 1.0
Influences the variety in the model's responses. Lower values lead to more predictable responses, while higher values encourage diversity. At 0, the model becomes deterministic (same response for same input).
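As an illustration, temperature can be thought of as rescaling the logits before the softmax: dividing by a value below 1.0 sharpens the distribution, while a value above 1.0 flattens it. A minimal sketch in Python (illustrative only, not the provider's actual implementation):

```python
import math

def apply_temperature(logits, temperature):
    """Convert logits to probabilities after scaling by 1/temperature.

    temperature == 0 is treated as greedy decoding: all probability
    mass goes to the single most likely token.
    """
    if temperature == 0.0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [l / temperature for l in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

Lowering the temperature increases the probability of the already-likely tokens, which is why low-temperature output is more predictable.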
Top P
- Key: top_p
- Type: Float (0.0 to 1.0)
- Default: 1.0
Limits the model's choices to a subset of likely tokens (nucleus sampling). Only the smallest set of top tokens whose cumulative probability reaches P is considered.
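A minimal sketch of nucleus filtering over a probability distribution (illustrative only; providers apply this inside the sampler):

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p; zero out the rest and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]
```

For example, with probabilities [0.5, 0.3, 0.2] and top_p = 0.7, the third token is cut off and the remaining two are renormalized.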
Top K
- Key: top_k
- Type: Integer (0 or above)
- Default: 0 (disabled)
Limits the model's choice to the top K most likely tokens at each step.
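Top-K can be sketched the same way as Top-P, but with a fixed token count instead of a probability budget (illustrative only):

```python
def top_k_filter(probs, top_k):
    """Keep only the top_k most likely tokens (top_k == 0 disables
    the filter); zero out the rest and renormalize."""
    if top_k == 0:
        return list(probs)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = set(order[:top_k])
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]
```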
Max Tokens
- Key: max_tokens
- Type: Integer (1 or above)
Sets the upper limit for the number of tokens the model can generate.
Penalties and Bias
Frequency Penalty
- Key: frequency_penalty
- Type: Float (-2.0 to 2.0)
- Default: 0.0
Penalizes tokens based on their frequency in the text so far. Encourages the model to use less frequent tokens.
Presence Penalty
- Key: presence_penalty
- Type: Float (-2.0 to 2.0)
- Default: 0.0
Penalizes tokens based on whether they have already appeared in the text. Encourages the model to talk about new topics.
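The two penalties above are commonly combined additively on the logits, following the OpenAI-style formula (individual providers may implement this differently):

```python
from collections import Counter

def apply_penalties(logits, generated, frequency_penalty=0.0, presence_penalty=0.0):
    """OpenAI-style additive penalties on already-generated tokens:
    logit -= count * frequency_penalty + presence_penalty (once per seen token).
    The frequency term grows with each repetition; the presence term is flat."""
    counts = Counter(generated)
    adjusted = list(logits)
    for token, count in counts.items():
        adjusted[token] -= count * frequency_penalty + presence_penalty
    return adjusted
```

A token generated twice is penalized twice by the frequency term but only once by the presence term, which is why presence_penalty encourages new topics while frequency_penalty suppresses verbatim repetition.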
Repetition Penalty
- Key: repetition_penalty
- Type: Float (0.0 to 2.0)
- Default: 1.0
Reduces the repetition of tokens from the prompt or previous output.
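Unlike the additive penalties above, repetition_penalty is typically multiplicative, in the style popularized by the CTRL paper (again, exact provider behavior may differ):

```python
def apply_repetition_penalty(logits, seen_tokens, penalty=1.0):
    """CTRL-style multiplicative penalty: divide positive logits of
    previously seen tokens by the penalty and multiply negative ones,
    so any penalty > 1.0 consistently discourages repeats."""
    adjusted = list(logits)
    for token in set(seen_tokens):
        if adjusted[token] > 0:
            adjusted[token] /= penalty
        else:
            adjusted[token] *= penalty
    return adjusted
```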
Logit Bias
- Key: logit_bias
- Type: Map (Token ID to Bias Value, -100 to 100)
Modifies the likelihood of specific tokens appearing in the completion.
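Conceptually, each bias value is added to the corresponding token's logit before sampling (a sketch, assuming the common clamping behavior at the range limits):

```python
def apply_logit_bias(logits, logit_bias):
    """Add per-token bias values, clamped to [-100, 100], before sampling.
    In practice -100 effectively bans a token and +100 strongly favors it."""
    adjusted = list(logits)
    for token_id, bias in logit_bias.items():
        adjusted[token_id] += max(-100.0, min(100.0, bias))
    return adjusted
```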
Advanced Sampling
Min P
- Key: min_p
- Type: Float (0.0 to 1.0)
- Default: 0.0
The minimum probability for a token to be considered, relative to the most likely token.
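Because the threshold is relative to the most likely token, Min P adapts to how peaked the distribution is. A minimal sketch (illustrative only):

```python
def min_p_filter(probs, min_p):
    """Discard tokens whose probability is below min_p times the
    probability of the most likely token; renormalize the rest."""
    if min_p <= 0.0:
        return list(probs)
    threshold = min_p * max(probs)
    total = sum(p for p in probs if p >= threshold)
    return [p / total if p >= threshold else 0.0 for p in probs]
```

With probabilities [0.5, 0.25, 0.125, 0.125] and min_p = 0.4, the cutoff is 0.4 × 0.5 = 0.2, so only the first two tokens survive.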
Top A
- Key: top_a
- Type: Float (0.0 to 1.0)
- Default: 0.0
Considers only the top tokens with "sufficiently high" probabilities based on the maximum probability (Dynamic Top-P).
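One common definition of Top-A sets the cutoff to top_a times the square of the maximum probability, so the filter tightens automatically when the model is confident. Assuming that definition (providers may use a variant):

```python
def top_a_filter(probs, top_a):
    """Top-A: keep tokens whose probability is at least
    top_a * (max probability) ** 2; renormalize the survivors."""
    if top_a <= 0.0:
        return list(probs)
    threshold = top_a * max(probs) ** 2
    total = sum(p for p in probs if p >= threshold)
    return [p / total if p >= threshold else 0.0 for p in probs]
```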
Seed
- Key: seed
- Type: Integer
Used for reproducible sampling. Requests with the same seed and parameters should return the same or very similar results.
Output Structure
Response Format
- Key: response_format
- Type: Object (e.g., { "type": "json_object" })
Forces the model to produce a specific output format. Setting to json_object enables JSON mode.
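A hedged sketch of a request body using JSON mode; the model name below is a placeholder, not taken from the Siraya AI model list:

```python
import json

def build_json_mode_request(model, messages):
    """Assemble a chat-completion payload that requests JSON output."""
    return {
        "model": model,
        "messages": messages,
        "response_format": {"type": "json_object"},
    }

request = build_json_mode_request(
    "example-model",  # placeholder model name, for illustration only
    [{"role": "user", "content": "List three colors as a JSON object."}],
)
body = json.dumps(request)  # serialized body to send with the HTTP request
```

When using JSON mode, it is generally still advisable to instruct the model in the prompt to produce JSON, as shown in the message above.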
Structured Outputs
- Key: structured_outputs
- Type: Boolean
Whether to use json_schema for structured outputs if supported by the model.
Stop
- Key: stop
- Type: Array of Strings (up to 4)
Immediately stops generation when any of the specified sequences is encountered.
Tools and Tool Choice
- Key: tools, tool_choice
- Type: Array (Tools), String or Object (Tool Choice)
Enables tool calling following the OpenAI specification. See the Tool Calling Guide for details.
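A sketch of a tool-calling payload in the OpenAI schema; the tool, function name, and model are illustrative placeholders, not part of the Siraya AI documentation:

```python
# Each tool is a function with a JSON Schema describing its parameters.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function, for illustration
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def build_tool_request(model, messages, tools, tool_choice="auto"):
    """Assemble a payload that offers the listed tools to the model."""
    return {
        "model": model,
        "messages": messages,
        "tools": tools,
        "tool_choice": tool_choice,
    }

request = build_tool_request(
    "example-model",  # placeholder model name
    [{"role": "user", "content": "What is the weather in Taipei?"}],
    [weather_tool],
)
```

Setting tool_choice to "auto" lets the model decide whether to call a tool; an object of the form { "type": "function", "function": { "name": ... } } forces a specific tool.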