Usage Accounting

Siraya AI provides transparent tracking of model usage, token counts, and associated costs. Our Usage Accounting features allow you to monitor your credit consumption programmatically.

Token Counting

By default, Siraya AI returns token counts in the usage field of the API response.

Native Tokenization: Costs and billing are always calculated using the model provider's native tokenizer.
Normalized Metrics: For convenience, some responses may also include model-agnostic token counts for cross-model comparisons.

Enabling Detailed Usage

To include detailed cost information in your response, set the usage parameter:

{
  "model": "gpt-4o",
  "messages": [...],
  "usage": {
    "include": true
  }
}

Response Format

The usage object in the response body includes:

Field	Description
`prompt_tokens`	Tokens sent in the request.
`completion_tokens`	Tokens generated by the model.
`total_tokens`	The sum of prompt and completion tokens.
`cost`	(Optional) The total amount charged for the request in credits.

Usage in Streaming

For streaming requests, usage statistics can be requested via stream_options:

{
  "model": "gpt-4o",
  "messages": [...],
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}

When enabled, the final message in the SSE stream will contain the complete usage statistics for the request.

Billing Transparency

You can view your real-time balance and usage history in the Dashboard. All charges are based on the specific model's price per 1M tokens as listed in our Pricing Page.