Evaluation reports
Generate structured PDF or CSV reports, formatted to the EvalCards standard, from your experiment results.
Once you've run experiments, you can generate structured reports from their results. Reports follow the EvalCards standard, so they can serve as formal AI evaluation documentation.
Generating a report
- Go to the Reports tab in your project.
- Click Generate report to open the configuration modal.
- Give your report a title. The project name is pre-filled as a default.
- Choose a format: PDF for a full document or CSV for raw data.
- Select which completed experiments to include.
- Pick the sections you want, such as evaluation context, metric results, and safety and compliance. The full list is under Report sections.
- Click Generate. Generation can take up to a minute.
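The choices made in the configuration modal can be pictured as a small data structure. The sketch below is illustrative only: this page describes a UI workflow, not a programmatic API, so `ReportRequest`, `ReportFormat`, `Experiment` and every field name are hypothetical. It mirrors two rules from the steps above: the title defaults to the project name, and only completed experiments can be included.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ReportFormat(Enum):
    """The two formats offered in the modal (hypothetical names)."""
    PDF = "pdf"  # full document
    CSV = "csv"  # raw data


@dataclass
class Experiment:
    """Hypothetical stand-in for an experiment in the project."""
    name: str
    completed: bool


@dataclass
class ReportRequest:
    """Hypothetical model of the options chosen in the configuration modal."""
    project_name: str
    report_format: ReportFormat
    experiments: List[Experiment]
    sections: List[str]
    title: str = ""

    def __post_init__(self) -> None:
        # The modal pre-fills the project name as the default title.
        if not self.title:
            self.title = self.project_name
        # Only completed experiments can be selected for a report.
        unfinished = [e.name for e in self.experiments if not e.completed]
        if unfinished:
            raise ValueError(f"experiments not completed: {unfinished}")


request = ReportRequest(
    project_name="Demo project",
    report_format=ReportFormat.PDF,
    experiments=[Experiment("run-1", completed=True)],
    sections=["Evaluation context", "Metric results"],
)
print(request.title)
```

Leaving `title` empty falls back to the project name, as the modal does; passing an unfinished experiment raises `ValueError`.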
Report sections
The configuration modal shows a checklist of sections. Each section can be toggled on or off:
- Executive summary: Overall scores, pass/fail verdict, key findings
- Evaluation context: Project, organization, evaluator and date
- Model under test: Provider, model ID and generation parameters
- Evaluation setup: Dataset, judge model, metrics and thresholds
- Metric results: Per-metric scores grouped by quality and safety
- Safety and compliance: Bias, toxicity and hallucination analysis
- Sample-level details: Per-sample scores table (off by default, increases file size)
- Arena comparison: Head-to-head results if you've run arena battles (off by default)
- Limitations and recommendations: Auto-generated suggestions based on failing metrics
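The checklist and its defaults can be summarized as a simple mapping. This is a sketch, not the product's configuration format: `DEFAULT_SECTIONS` and `selected_sections` are hypothetical names, and treating every section not marked "off by default" as enabled is an assumption drawn from the list above.

```python
from typing import Dict, List, Optional

# Hypothetical mapping of section name -> enabled by default.
# Sections not marked "off by default" are assumed to start enabled.
DEFAULT_SECTIONS: Dict[str, bool] = {
    "Executive summary": True,
    "Evaluation context": True,
    "Model under test": True,
    "Evaluation setup": True,
    "Metric results": True,
    "Safety and compliance": True,
    "Sample-level details": False,  # off by default, increases file size
    "Arena comparison": False,      # off by default
    "Limitations and recommendations": True,
}


def selected_sections(overrides: Optional[Dict[str, bool]] = None) -> List[str]:
    """Return enabled section names, in checklist order, after applying toggles."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_SECTIONS)
    if unknown:
        raise KeyError(f"unknown sections: {sorted(unknown)}")
    # Overrides only change values, so the checklist order is preserved.
    merged = {**DEFAULT_SECTIONS, **overrides}
    return [name for name, enabled in merged.items() if enabled]


# Example: include arena results, drop the executive summary.
print(selected_sections({"Arena comparison": True, "Executive summary": False}))
```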
Viewing and downloading reports
PDF reports open in an inline viewer on the page. From the viewer toolbar, you can download the file or open it in a new browser tab. CSV reports download automatically.
All generated reports appear in a history table below the Generate report button. The table shows the report name, report type, project/organization, generation date and who generated it. Click any row to view a PDF report or download a CSV report.
Deleting reports
To delete a report, click the trash icon next to it in the history table. You'll be asked to confirm before the report is permanently removed.