Dataset • Active

MLCommons AILuminate AI Safety Benchmark

MLCommons’ AILuminate benchmark assesses the safety of general-purpose chat models across a broad set of hazard categories, providing standardized, third-party safety grades to complement capability-focused benchmarks.

At a glance

Published: 2025

Jurisdiction: Global

More in Datasets and benchmarks

FairFace dataset: balanced face images for race, gender, age bias testing

UCLA • 2021

BIG-bench: Beyond the Imitation Game Benchmark

Google & Contributors • 2023

HELM: Holistic Evaluation of Language Models

Stanford CRFM • 2023

Related resources

EleutherAI LM Evaluation Harness

Assessment and evaluation • EleutherAI

ISO/IEC 25000 - Software Quality Requirements and Evaluation

Assessment and evaluation • ISO/IEC

AI Risk Management Framework

Assessment and evaluation • NIST
