Analytics
Monitor LLM usage, costs, and guardrail activity across all providers.
Overview
The Analytics page shows LLM usage, costs, and guardrail activity across your organization. Every request through the AI Gateway is tracked with cost, token count, latency, and model.
Summary cards
Four stat cards at the top summarize the selected time period:
- Total cost: Combined spend across all endpoints and providers
- Total requests: Number of completion and embedding requests processed
- Total tokens: Combined prompt and completion tokens across all requests
- Avg latency: Average round-trip time from request to complete response
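The four card metrics above are straightforward aggregates over per-request records. A minimal sketch, assuming illustrative field names (`cost`, `prompt_tokens`, `completion_tokens`, `latency_ms`) rather than the gateway's actual schema:

```python
# Derive the four summary-card metrics from per-request records.
# Field names and values are illustrative, not the gateway's real schema.
records = [
    {"cost": 0.0042, "prompt_tokens": 310, "completion_tokens": 120, "latency_ms": 840},
    {"cost": 0.0015, "prompt_tokens": 95,  "completion_tokens": 40,  "latency_ms": 420},
    {"cost": 0.0090, "prompt_tokens": 780, "completion_tokens": 510, "latency_ms": 1630},
]

total_cost = sum(r["cost"] for r in records)          # Total cost
total_requests = len(records)                         # Total requests
total_tokens = sum(                                   # Total tokens (prompt + completion)
    r["prompt_tokens"] + r["completion_tokens"] for r in records
)
avg_latency_ms = (                                    # Avg latency (round-trip)
    sum(r["latency_ms"] for r in records) / total_requests
)

print(round(total_cost, 4), total_requests, total_tokens, round(avg_latency_ms))
```

Note that "Total tokens" counts prompt and completion tokens together, matching the card definition above.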
Time period selector
Use the dropdown in the top right to switch between Today, 7 days, 30 days, and 90 days. Your selection is saved and persists across sessions. When "Today" is selected, the cost chart shows hourly bars instead of a daily trend line.
Cost over time
For "Today", a bar chart shows cost for each hour of the day. For longer periods (7d, 30d, 90d), a line chart shows the daily cost trend. Hover over any data point to see the exact cost.
Cost by model
A horizontal bar chart showing spend per LLM model. If you run multiple models, check whether routing simpler tasks to a lighter model could cut costs.
Cost by endpoint
A ranked list showing spend and request volume per endpoint. Each entry shows the endpoint name, request count, and total cost.
Top users
The top 10 users ranked by spend, with request count, token usage, and cost per user.
Guardrails activity
This section appears when guardrail rules are active and have triggered during the selected period. It shows:
- Blocked: Number of requests rejected by guardrail rules
- Masked: Number of requests where content was redacted before reaching the LLM
- A breakdown by guardrail type (PII detection vs content filter) and action taken
Request logs
Click "Load logs" at the bottom of the page to view recent requests with full details. Each row shows the endpoint, model, user, tokens, cost, and status code. Click a row to expand it and see:
- Request: The full message array sent to the LLM (JSON)
- Response: The model's response text
- Error: Error message if the request failed
- Metadata: Custom tags attached to the request (e.g., department, project)
- Latency, prompt tokens, and completion tokens
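The fields above can be pictured as one record per request. The sketch below shows a hypothetical expanded log entry and how the Request field, stored as a JSON message array, can be parsed; the field names are assumptions for illustration, not the gateway's documented schema:

```python
import json

# Hypothetical shape of one expanded log row (illustrative field names).
log_entry = {
    "endpoint": "support-chat",
    "model": "gpt-4o-mini",
    "user": "alice",
    "status_code": 200,
    "cost": 0.0021,
    "latency_ms": 710,
    "prompt_tokens": 152,
    "completion_tokens": 48,
    "request": json.dumps([{"role": "user", "content": "Hello"}]),
    "response": "Hi! How can I help?",
    "error": None,
    "metadata": {"department": "engineering", "project": "chatbot"},
}

# Request: the full message array sent to the LLM, stored as JSON.
messages = json.loads(log_entry["request"])
total_tokens = log_entry["prompt_tokens"] + log_entry["completion_tokens"]

print(messages[0]["role"], total_tokens)
```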
Metadata tags
API callers can attach metadata to requests (e.g., {"department": "engineering", "project": "chatbot"}). This metadata is stored in the spend log and visible in expanded request details. A "Cost by tag" API endpoint is available for programmatic tag-based analytics.
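A minimal sketch of attaching metadata to a completion request. Whether the gateway reads a top-level "metadata" field, and the endpoint path shown in the comment, are assumptions here; check your gateway's API reference for the exact shape:

```python
import json

# Build a completion request with metadata tags attached.
# The top-level "metadata" field is an assumption about the gateway's
# request schema, shown for illustration only.
payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Summarize this ticket."}],
    "metadata": {"department": "engineering", "project": "chatbot"},
}
body = json.dumps(payload).encode()

# The body would then be POSTed to the gateway, e.g. with urllib
# (hypothetical URL and key):
# import urllib.request
# req = urllib.request.Request(
#     "https://gateway.example.com/v1/chat/completions",
#     data=body,
#     headers={"Content-Type": "application/json",
#              "Authorization": "Bearer <key>"},
# )
# urllib.request.urlopen(req)

print(json.loads(body)["metadata"]["project"])
```

Tags attached this way land in the spend log, so they appear in the expanded request details and can be aggregated through the "Cost by tag" API endpoint.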