Analytics
Monitor LLM usage, costs, and guardrail activity across all providers.
Overview
The Analytics page shows LLM usage, costs, and guardrail activity across your organization. Every request through the AI Gateway is tracked with cost, token count, latency, and model.
Summary cards
Four stat cards at the top summarize the selected time period:
- Total cost: Combined spend across all endpoints and providers
- Total requests: Number of completion and embedding requests processed
- Total tokens: Combined prompt and completion tokens across all requests
- Avg latency: Average round-trip time from request to complete response
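The four card metrics above are straightforward aggregates over per-request records. A minimal sketch, assuming illustrative field names (`cost`, `prompt_tokens`, `completion_tokens`, `latency_ms`) rather than the gateway's actual schema:

```python
# Derive the four summary-card metrics from per-request records.
# Field names and values are illustrative, not the gateway's real schema.
records = [
    {"cost": 0.0042, "prompt_tokens": 310, "completion_tokens": 120, "latency_ms": 840},
    {"cost": 0.0015, "prompt_tokens": 95,  "completion_tokens": 40,  "latency_ms": 420},
    {"cost": 0.0090, "prompt_tokens": 780, "completion_tokens": 510, "latency_ms": 1630},
]

total_cost = sum(r["cost"] for r in records)          # Total cost
total_requests = len(records)                         # Total requests
total_tokens = sum(                                   # Total tokens (prompt + completion)
    r["prompt_tokens"] + r["completion_tokens"] for r in records
)
avg_latency_ms = (                                    # Avg latency (round-trip)
    sum(r["latency_ms"] for r in records) / total_requests
)

print(round(total_cost, 4), total_requests, total_tokens, round(avg_latency_ms))
```

Note that "Total tokens" counts prompt and completion tokens together, matching the card definition above.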
Time period selector
Use the dropdown in the top right to switch between Today, 7 days, 30 days, and 90 days. Your selection is saved and persists across sessions. When "Today" is selected, the cost chart shows hourly bars instead of a daily trend line.
Cost over time
For "Today", a bar chart shows cost for each hour of the day. For longer periods (7d, 30d, 90d), a line chart shows the daily cost trend. Hover over any data point to see the exact cost.
Cost by model
A horizontal bar chart showing spend per LLM model. If you run multiple models, check whether routing simpler tasks to a lighter model could cut costs.
Cost by endpoint
A ranked list showing spend and request volume per endpoint. Each entry shows the endpoint name, request count, and total cost.
Top users
The top 10 users ranked by spend, with request count, token usage, and cost per user.
Guardrails activity
This section appears when guardrail rules are active and have triggered during the selected period. It shows:
- Blocked: Number of requests rejected by guardrail rules
- Masked: Number of requests where content was redacted before reaching the LLM
- A breakdown by guardrail type (PII detection vs content filter) and action taken
Request logs
Click "Load logs" at the bottom of the page to view recent requests with full details. Each row shows the endpoint, model, user, tokens, cost, and status code. Click a row to expand it and see:
- Request: The full message array sent to the LLM (JSON)
- Response: The model's response text
- Error: Error message if the request failed
- Metadata: Custom tags attached to the request (e.g., department, project)
- Latency, prompt tokens, and completion tokens
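The fields above can be pictured as one record per request. The sketch below shows a hypothetical expanded log entry and how the Request field, stored as a JSON message array, can be parsed; the field names are assumptions for illustration, not the gateway's documented schema:

```python
import json

# Hypothetical shape of one expanded log row (illustrative field names).
log_entry = {
    "endpoint": "support-chat",
    "model": "gpt-4o-mini",
    "user": "alice",
    "status_code": 200,
    "cost": 0.0021,
    "latency_ms": 710,
    "prompt_tokens": 152,
    "completion_tokens": 48,
    "request": json.dumps([{"role": "user", "content": "Hello"}]),
    "response": "Hi! How can I help?",
    "error": None,
    "metadata": {"department": "engineering", "project": "chatbot"},
}

# Request: the full message array sent to the LLM, stored as JSON.
messages = json.loads(log_entry["request"])
total_tokens = log_entry["prompt_tokens"] + log_entry["completion_tokens"]

print(messages[0]["role"], total_tokens)
```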
Metadata tags
API callers can attach metadata to requests (e.g., {"department": "engineering", "project": "chatbot"}). This metadata is stored in the spend log and visible in expanded request details. A "Cost by tag" API endpoint is available for programmatic tag-based analytics.
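A minimal sketch of attaching metadata to a completion request. Whether the gateway reads a top-level "metadata" field, and the endpoint path shown in the comment, are assumptions here; check your gateway's API reference for the exact shape:

```python
import json

# Build a completion request with metadata tags attached.
# The top-level "metadata" field is an assumption about the gateway's
# request schema, shown for illustration only.
payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Summarize this ticket."}],
    "metadata": {"department": "engineering", "project": "chatbot"},
}
body = json.dumps(payload).encode()

# The body would then be POSTed to the gateway, e.g. with urllib
# (hypothetical URL and key):
# import urllib.request
# req = urllib.request.Request(
#     "https://gateway.example.com/v1/chat/completions",
#     data=body,
#     headers={"Content-Type": "application/json",
#              "Authorization": "Bearer <key>"},
# )
# urllib.request.urlopen(req)

print(json.loads(body)["metadata"]["project"])
```

Tags attached this way land in the spend log, so they appear in the expanded request details and can be aggregated through the "Cost by tag" API endpoint.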