Usage Accounting
Siraya AI provides transparent tracking of model usage, token counts, and associated costs. Our Usage Accounting features allow you to monitor your credit consumption programmatically.
Token Counting
By default, Siraya AI returns token counts in the usage field of the API response.
- Native Tokenization: Costs and billing are always calculated using the model provider's native tokenizer.
- Normalized Metrics: For convenience, some responses may also include model-agnostic token counts for cross-model comparisons.
Enabling Detailed Usage
To include detailed cost information in the response, set the usage parameter in your request.
Response Format
The usage object in the response body includes:
| Field | Description |
|---|---|
| prompt_tokens | Tokens sent in the request. |
| completion_tokens | Tokens generated by the model. |
| total_tokens | The sum of prompt and completion tokens. |
| cost | (Optional) The total amount charged for the request in credits. |
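
As a minimal sketch of reading these fields on the client side, the Python below parses a decoded JSON response body into a small record and checks that total_tokens matches the sum described in the table. The `Usage` class, the `parse_usage` helper, and the sample numbers are hypothetical, and treating cost as a plain number is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Usage:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    cost: Optional[float] = None  # optional field; assumed to be numeric credits


def parse_usage(response_body: Mapping[str, Any]) -> Usage:
    """Extract the usage object from a decoded JSON response body."""
    raw = response_body["usage"]
    usage = Usage(
        prompt_tokens=int(raw["prompt_tokens"]),
        completion_tokens=int(raw["completion_tokens"]),
        total_tokens=int(raw["total_tokens"]),
        cost=float(raw["cost"]) if raw.get("cost") is not None else None,
    )
    # The table defines total_tokens as the sum of the other two counts.
    if usage.total_tokens != usage.prompt_tokens + usage.completion_tokens:
        raise ValueError("total_tokens does not equal prompt + completion tokens")
    return usage


# Illustrative response body; the numbers are made up for this example.
sample_body = {
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 30,
        "total_tokens": 42,
        "cost": 0.0003,
    }
}
usage = parse_usage(sample_body)
```

Because cost is optional, callers should handle `usage.cost` being `None` instead of assuming it is always present.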
Usage in Streaming
For streaming requests, usage statistics can be requested via stream_options:
```json
{
  "model": "gpt-4o",
  "messages": [...],
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}
```
When enabled, the final message in the SSE stream will contain the complete usage statistics for the request.
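
One way to pick up those statistics on the client is to scan the stream and keep the last usage object seen. The sketch below is illustrative only: `usage_from_sse` is a hypothetical helper, and it assumes each event arrives as a `data: <json>` line and that the stream ends with a `data: [DONE]` sentinel, neither of which is confirmed on this page.

```python
import json
from typing import Any, Dict, Iterable, Optional


def usage_from_sse(lines: Iterable[str]) -> Optional[Dict[str, Any]]:
    """Return the last usage object found in an SSE stream, or None."""
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage


# Made-up stream for illustration; real chunks may carry more fields.
sample_stream = [
    'data: {"choices": [{"delta": {"content": "Hi"}}]}',
    "",
    'data: {"choices": [], "usage": {"prompt_tokens": 5, "completion_tokens": 1, "total_tokens": 6}}',
    "",
    "data: [DONE]",
]
stream_usage = usage_from_sse(sample_stream)
```

Keeping the last usage object, instead of stopping at the first one, tolerates streams in which earlier chunks carry an empty or null usage field.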
Billing Transparency
You can view your real-time balance and usage history in the Dashboard. All charges are based on the specific model's price per 1M tokens, as listed on our Pricing Page.
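
To make the per-1M-token arithmetic concrete, here is a rough estimate in Python. The `estimate_cost` function and the prices used are hypothetical, and the split into separate input and output prices is an assumption; the authoritative figure is the charge reported by the API and the Dashboard.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(
    prompt_tokens: int,
    completion_tokens: int,
    input_price_per_1m: str,
    output_price_per_1m: str,
) -> Decimal:
    """Estimate a request's cost from token counts and per-1M-token prices."""
    # Decimal avoids binary floating-point rounding in money-like values.
    input_cost = Decimal(prompt_tokens) * Decimal(input_price_per_1m) / TOKENS_PER_MILLION
    output_cost = Decimal(completion_tokens) * Decimal(output_price_per_1m) / TOKENS_PER_MILLION
    return input_cost + output_cost


# Hypothetical prices: 2.50 per 1M input tokens, 10.00 per 1M output tokens.
estimated = estimate_cost(1200, 300, "2.50", "10.00")
```

With these made-up numbers, 1,200 prompt tokens and 300 completion tokens each contribute 0.003, for an estimate of 0.006.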