LLM Evals

Leaderboard

View model performance rankings based on arena comparison results.

The leaderboard is planned for a future release. The sidebar item is currently hidden in the UI; this page documents the expected behaviour when the feature ships.

What the leaderboard will do

The leaderboard will rank models based on their performance in arena comparisons. Every time you run a head-to-head battle in the Arena, results will feed into a ranking table so you can see which models perform best over time.

Planned features

  • Organization-wide rankings from all arena comparisons
  • Win rate, total comparisons, and average scores per model
  • Performance tracking as you add more arena battles over time
  • Data you can reference in compliance documentation to justify model selection

Once available, run arena comparisons across different prompt types to get a complete picture. A model that handles coding well might not rank the same on creative tasks.
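To make the planned metrics concrete, here is a minimal sketch of how a win-rate ranking could be derived from head-to-head results. The data format and function name are illustrative assumptions, not the product's actual API, and this ignores ties in the win count while still counting them toward totals:

```python
from collections import defaultdict

def leaderboard(battles):
    """Rank models by win rate from head-to-head arena results.

    `battles` is a hypothetical list of (model_a, model_b, winner)
    tuples, where winner is "a", "b", or "tie".
    Returns (model, win_rate, total_comparisons) rows, best first.
    """
    wins = defaultdict(int)
    total = defaultdict(int)
    for a, b, winner in battles:
        # Every battle counts toward both models' totals.
        total[a] += 1
        total[b] += 1
        if winner == "a":
            wins[a] += 1
        elif winner == "b":
            wins[b] += 1

    rows = [(m, wins[m] / total[m], total[m]) for m in total]
    return sorted(rows, key=lambda r: r[1], reverse=True)

# Hypothetical battle log with made-up model names:
results = leaderboard([
    ("gpt", "claude", "b"),
    ("gpt", "llama", "a"),
    ("claude", "llama", "a"),
])
# results → [("claude", 1.0, 2), ("gpt", 0.5, 2), ("llama", 0.0, 2)]
```

This also shows why running comparisons across varied prompt types matters: a model's win rate is only meaningful over the set of battles it actually appeared in.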