Usage & Billing
Monitor token usage and costs across your Diosc deployment.
Usage Dashboard
Navigate to Usage in the sidebar to open the usage dashboard.

Overview Metrics
The top of the page shows four key metric cards:
| Metric | Description |
|---|---|
| Total Cost | Total cost in USD for the selected period |
| Avg Cost per Session | Average cost per chat session |
| Input Tokens | Total input tokens consumed (with their percentage of all tokens) |
| Output Tokens | Total output tokens generated (with their percentage of all tokens) |
Filtering
Use the filters at the top of the page to narrow the data:
Assistant Filter — Select a specific assistant or All Assistants to see aggregated data.
Date Range — Choose from preset date ranges:
| Option | Description |
|---|---|
| Today | Current day |
| Yesterday | Previous day |
| Last 7 Days | Rolling 7-day window |
| Last 30 Days | Rolling 30-day window (default) |
| This Month | Current calendar month |
| Last Month | Previous calendar month |
Charts & Visualizations
Cost Over Time
A line chart showing cost and total tokens over the selected period. This helps identify usage spikes and trends.

Token Distribution
A bar chart showing the breakdown of input tokens vs. output tokens over time.

Cost by Model
A pie chart showing how costs are distributed across different LLM models. Hover over segments to see exact amounts.
Model Details Table
| Column | Description |
|---|---|
| Model | LLM model name |
| Cost | Total cost for this model |
| Tokens | Total tokens consumed |
| Percentage | Share of total usage |
| Calls | Number of interactions |
Cost Breakdown
A detailed card showing:
- Input costs — Cost of input tokens
- Output costs — Cost of output tokens
- Total interactions — Number of LLM calls
- Total sessions — Number of chat sessions
Billing Dashboard
The Billing page provides a deeper breakdown of resource costs across all resource types.

Summary Cards
| Card | Description |
|---|---|
| Total Cost | Total cost with interaction count |
| Cost by Type | Breakdown by LLM, Embedding, and Plugin with percentages |
| Avg Cost per Session | Average session cost |
Charts
- Cost Trend — Line chart showing cost over time
- Cost by Resource Type — Pie chart breaking down LLM, Embedding, and Plugin costs
- Top Resources by Cost — Bar chart of the top 10 most expensive resources
Detailed Breakdown Table
| Column | Description |
|---|---|
| Resource | Resource name |
| Type | LLM, Embedding, or Plugin |
| Quantity | Usage count |
| Cost | Total cost |
| % of Total | Percentage of total spending |
Filtering
- Period — Today, Yesterday, Last 7 Days, Last 30 Days, This Month, Last Month
- Resource Type — All, LLM, Embedding, Plugin
Resource Pricing Configuration
DioscHub tracks costs for three resource types — LLM, Embedding, and Plugin — each with configurable pricing.
Pricing Models
| Model | Description | Example |
|---|---|---|
| Tiered | Volume-based pricing with tiers (default) | First 1M tokens free, then $3/M tokens |
| Flat | Single rate per unit | $0.003 per 1M tokens |
| Subscription | Fixed cost per billing period | $100/month flat |
Default Model Pricing
DioscHub ships with default pricing for common models. These can be customized in Global Settings > Resource Pricing:
| Model | Input ($/M tokens) | Output ($/M tokens) |
|---|---|---|
| Claude Sonnet 4 | $3.00 | $15.00 |
| Claude Opus 4 | $15.00 | $75.00 |
| Claude Haiku 3.5 | $0.80 | $4.00 |
| GPT-4o | $2.50 | $10.00 |
| GPT-4o Mini | $0.15 | $0.60 |
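
As a rough illustration of how the rates in this table turn into a dollar figure, the sketch below multiplies token counts by the per-million prices. It assumes a single flat rate per direction and ignores any tiers, so it is only an approximation of what a tiered configuration would bill; `estimate_cost` and `DEFAULT_PRICES` are hypothetical names, not part of DioscHub.

```python
# Default prices copied from the table above: (input, output) in USD per million tokens.
DEFAULT_PRICES = {
    "Claude Sonnet 4": (3.00, 15.00),
    "Claude Opus 4": (15.00, 75.00),
    "Claude Haiku 3.5": (0.80, 4.00),
    "GPT-4o": (2.50, 10.00),
    "GPT-4o Mini": (0.15, 0.60),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload at flat per-million rates (hypothetical helper)."""
    input_price, output_price = DEFAULT_PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 1M input tokens and 200K output tokens on Claude Sonnet 4.
cost = estimate_cost("Claude Sonnet 4", 1_000_000, 200_000)
print(f"${cost:.2f}")  # $6.00
```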
Configuring Custom Pricing
Navigate to Global Settings > Resource Pricing or use the API:
POST /api/admin/resource-config
Authorization: Bearer {token}
{
  "resourceType": "llm",
  "resourceId": "anthropic:claude-sonnet-4-20250514",
  "displayName": "Claude Sonnet 4",
  "unitType": "token",
  "hasInputOutput": true,
  "pricingMechanism": "tiered",
  "pricingTiers": [
    { "upTo": 1000000, "pricePerUnit": 0 },
    { "upTo": null, "pricePerUnit": 3.0 }
  ],
  "outputPricingTiers": [
    { "upTo": 1000000, "pricePerUnit": 0 },
    { "upTo": null, "pricePerUnit": 15.0 }
  ],
  "resetPolicy": "monthly",
  "resetDayOfMonth": 1
}
Prices are per million units. A tier with "upTo": null covers all remaining usage.
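To make the tier arithmetic concrete, here is a minimal sketch of how a tiered cost could be computed from a `pricingTiers` array. It assumes each `upTo` value is a cumulative upper bound on usage within the billing period and that tiers are listed in ascending order; the source does not spell out either point, and `tiered_cost` is a hypothetical helper rather than DioscHub's own implementation.

```python
from typing import Optional, TypedDict


class Tier(TypedDict):
    upTo: Optional[int]   # cumulative upper bound in units; None means "no upper bound"
    pricePerUnit: float   # USD per million units


def tiered_cost(quantity: int, tiers: list[Tier]) -> float:
    """Sum the cost of `quantity` units across ascending tiers (assumed semantics)."""
    cost = 0.0
    lower = 0  # units already covered by earlier tiers
    for tier in tiers:
        upper = tier["upTo"]
        ceiling = quantity if upper is None else min(quantity, upper)
        units_in_tier = max(ceiling - lower, 0)
        cost += units_in_tier * tier["pricePerUnit"] / 1_000_000
        if upper is None or quantity <= upper:
            break  # all usage has been priced
        lower = upper
    return cost


input_tiers: list[Tier] = [
    {"upTo": 1_000_000, "pricePerUnit": 0},
    {"upTo": None, "pricePerUnit": 3.0},
]

# 7.5M input tokens: the first 1M are free, the remaining 6.5M cost $3 per million.
print(tiered_cost(7_500_000, input_tiers))  # 19.5
```

Under these assumptions, usage that stays inside the first tier costs nothing, and only the portion above 1M units is billed at the second tier's rate.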
Seeding Defaults
To populate pricing for all standard models at once:
POST /api/admin/resource-config/seed-defaults
Authorization: Bearer {token}
Quota Management
Set soft and hard usage limits per resource to control spending.
Soft Limits (Warnings)
Soft limits send alerts when usage approaches a threshold. They do not block usage:
{
  "softLimitQuantity": 10000000,
  "softLimitCostUsd": 50.00,
  "alertWebhookUrl": "https://your-system.com/alerts",
  "alertEmails": ["billing@company.com"]
}
When a soft limit is reached, DioscHub sends a webhook notification and logs an alert.
Hard Limits (Blocking)
Hard limits block further usage once the threshold is reached; blocked requests receive 429 Too Many Requests. Configure them with:
{
  "hardLimitQuantity": 50000000,
  "hardLimitCostUsd": 200.00
}
Hard limits block AI tool execution. Set them carefully to avoid disrupting production workflows.
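The difference between the two limit types can be sketched as a small classification function. This is an illustration only: it assumes a limit counts as reached when either the quantity or the cost threshold is met or exceeded, which the source does not state explicitly, and `limit_state` is a hypothetical name.

```python
def limit_state(quantity: int, cost_usd: float, limits: dict) -> str:
    """Classify usage as "ok", "warning" (soft limit) or "blocked" (hard limit)."""

    def reached(quantity_key: str, cost_key: str) -> bool:
        # A missing key means that limit is not configured.
        quantity_limit = limits.get(quantity_key)
        cost_limit = limits.get(cost_key)
        return (quantity_limit is not None and quantity >= quantity_limit) or (
            cost_limit is not None and cost_usd >= cost_limit
        )

    if reached("hardLimitQuantity", "hardLimitCostUsd"):
        return "blocked"   # further usage would be rejected (429)
    if reached("softLimitQuantity", "softLimitCostUsd"):
        return "warning"   # alerts are sent, usage continues
    return "ok"


limits = {
    "softLimitQuantity": 10_000_000,
    "softLimitCostUsd": 50.00,
    "hardLimitQuantity": 50_000_000,
    "hardLimitCostUsd": 200.00,
}

print(limit_state(7_500_000, 22.50, limits))    # ok
print(limit_state(12_000_000, 36.00, limits))   # warning
print(limit_state(12_000_000, 210.00, limits))  # blocked
```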
Billing Periods
Quotas reset automatically based on the configured policy:
| Reset Policy | Description |
|---|---|
| None | Never resets (cumulative) |
| Daily | Resets every day at the configured hour |
| Weekly | Resets on the configured day and hour |
| Monthly | Resets on the configured day of month |
| Yearly | Resets on January 1st |
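
For the monthly policy, the current billing period can be derived from the reset day. The sketch below assumes resets happen at midnight UTC and restricts the reset day to 1-28 so that every month contains it; both are simplifying assumptions, and `monthly_period` is a hypothetical helper.

```python
from datetime import datetime, timezone


def monthly_period(now: datetime, reset_day: int = 1) -> tuple[datetime, datetime]:
    """Return (start, end) of the monthly billing period containing `now` (assumed UTC)."""
    if not 1 <= reset_day <= 28:
        raise ValueError("reset_day must be between 1 and 28 in this sketch")

    # Count months from year 0 so stepping one month back or forward is plain arithmetic.
    month_index = now.year * 12 + (now.month - 1)
    if now.day < reset_day:
        month_index -= 1  # this month's reset has not happened yet

    def reset_instant(index: int) -> datetime:
        return datetime(index // 12, index % 12 + 1, reset_day, tzinfo=timezone.utc)

    return reset_instant(month_index), reset_instant(month_index + 1)


start, end = monthly_period(datetime(2026, 2, 14, 12, 0, tzinfo=timezone.utc), reset_day=1)
print(start.isoformat(), end.isoformat())
# 2026-02-01T00:00:00+00:00 2026-03-01T00:00:00+00:00
```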
Checking Quota Status
GET /api/admin/resource-usage/llm/anthropic:claude-sonnet-4-20250514/quota
Authorization: Bearer {token}
Response:
{
  "resourceType": "llm",
  "resourceId": "anthropic:claude-sonnet-4-20250514",
  "currentQuantity": 7500000,
  "currentCost": 22.50,
  "softLimitQuantity": 10000000,
  "softLimitCost": 50.00,
  "hardLimitQuantity": 50000000,
  "hardLimitCost": 200.00,
  "percentUsed": 15,
  "isAtSoftLimit": false,
  "isAtHardLimit": false,
  "billingPeriodStart": "2026-02-01T00:00:00Z",
  "billingPeriodEnd": "2026-03-01T00:00:00Z"
}
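A client might turn this response into remaining headroom before each limit. The sketch below works on a trimmed copy of the sample body; `headroom` is a hypothetical helper. In the sample, `percentUsed` happens to equal `currentQuantity / hardLimitQuantity`, but the source does not define the field, so treat that relationship as an observation rather than a guarantee.

```python
import json

# Trimmed copy of the sample response above, limited to the fields used here.
SAMPLE = """
{
  "currentQuantity": 7500000,
  "currentCost": 22.50,
  "softLimitQuantity": 10000000,
  "hardLimitQuantity": 50000000,
  "hardLimitCost": 200.00,
  "percentUsed": 15
}
"""


def headroom(quota: dict) -> dict:
    """Compute how much usage remains before the soft and hard limits."""
    return {
        "tokens_until_soft_limit": quota["softLimitQuantity"] - quota["currentQuantity"],
        "tokens_until_hard_limit": quota["hardLimitQuantity"] - quota["currentQuantity"],
        "budget_until_hard_limit": quota["hardLimitCost"] - quota["currentCost"],
        "percent_of_hard_quantity": 100 * quota["currentQuantity"] / quota["hardLimitQuantity"],
    }


summary = headroom(json.loads(SAMPLE))
print(summary)
```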
Exporting Usage Data
Usage export requires the Enterprise tier.
Export usage data as CSV or PDF for compliance, accounting, or internal reporting:
GET /api/admin/resource-usage/export?format=csv&startDate=2026-01-01&endDate=2026-01-31
Authorization: Bearer {token}
Supported formats:
- CSV — Summary by resource type + detailed line items
- PDF — Formatted report with charts and top resources breakdown
Query parameters:
| Parameter | Description |
|---|---|
| format | csv or pdf |
| assistantId | Filter by assistant (optional) |
| resourceType | Filter by llm, embedding, or plugin (optional) |
| startDate | Start of period (ISO 8601) |
| endDate | End of period (ISO 8601) |
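
The query string can be assembled programmatically from these parameters. The sketch below only builds the URL and does not send a request; the base URL `https://hub.example.com` and the helper name `export_url` are placeholders, not values from the source.

```python
from typing import Optional
from urllib.parse import urlencode


def export_url(
    base_url: str,
    fmt: str,
    start_date: str,
    end_date: str,
    assistant_id: Optional[str] = None,
    resource_type: Optional[str] = None,
) -> str:
    """Build the usage export URL from the documented query parameters."""
    if fmt not in ("csv", "pdf"):
        raise ValueError("format must be 'csv' or 'pdf'")
    params = {"format": fmt, "startDate": start_date, "endDate": end_date}
    # Optional filters are only added when supplied.
    if assistant_id is not None:
        params["assistantId"] = assistant_id
    if resource_type is not None:
        params["resourceType"] = resource_type
    return f"{base_url}/api/admin/resource-usage/export?{urlencode(params)}"


# Placeholder host; substitute your own deployment's address.
print(export_url("https://hub.example.com", "csv", "2026-01-01", "2026-01-31"))
```

The request itself would still need the `Authorization: Bearer {token}` header shown above.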
Best Practices
Monitor Regularly
- Check the dashboard periodically to spot unusual patterns
- Compare usage across assistants to identify optimization opportunities
- Watch for cost spikes that may indicate issues
Optimize Costs
- Use Budget-tier models for simple, high-volume tasks
- Use Premium models only where complex reasoning is needed
- Review the Model Details table to identify cost-heavy models
- Check the Cost by Model chart to see if cheaper models could handle the workload
Set Appropriate Quotas
- Start with soft limits to understand usage patterns before enforcing hard limits
- Set soft limits at 80% of your budget to get early warnings
- Use per-model hard limits to prevent runaway costs on expensive models
- Review quota alerts regularly and adjust limits as usage patterns stabilize
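The 80% guideline above can be turned into limit values with a few lines of arithmetic. This sketch assumes the hard limit is set equal to the full budget, which is one possible choice rather than a recommendation from the source; `quota_from_budget` is a hypothetical helper, while the two field names match the limit examples earlier on this page.

```python
def quota_from_budget(budget_usd: float, soft_fraction: float = 0.8) -> dict:
    """Derive soft and hard cost limits from a budget (hard limit = budget is an assumption)."""
    if not 0 < soft_fraction <= 1:
        raise ValueError("soft_fraction must be in (0, 1]")
    return {
        "softLimitCostUsd": round(budget_usd * soft_fraction, 2),
        "hardLimitCostUsd": round(budget_usd, 2),
    }


print(quota_from_budget(200.00))
# {'softLimitCostUsd': 160.0, 'hardLimitCostUsd': 200.0}
```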
Track Per-Assistant
- Use the assistant filter to compare costs between assistants
- Identify which assistants consume the most resources
- Adjust model selection per assistant based on complexity needs
Next Steps
- Global Settings — Configure resource pricing and plugins
- Monitoring — Audit logs and system health