datasetactive

BBQ: A Hand-Built Bias Benchmark for Question Answering

BBQ is a hand-built benchmark that measures social bias in question-answering models across nine demographic dimensions, testing how model outputs shift with and without disambiguating context.

At a glance

Published

2022

Jurisdiction

Global

More in Datasets and benchmarks

FairFace dataset: balanced face images for race, gender, age bias testing

UCLA • 2021

BIG-bench: Beyond the Imitation Game Benchmark

Google & Contributors • 2023

HELM: Holistic Evaluation of Language Models

Stanford CRFM • 2023

Related resources

Responsible AI Principles and Approach

Governance frameworks • Microsoft

What is Responsible AI - Azure Machine Learning

Governance frameworks • Microsoft

Responsible AI: Ethical Policies and Practices

Ethics and principles • Microsoft

Build your AI governance program

VerifyWise helps you implement AI governance frameworks, track compliance, and manage risk across your AI systems.

Explore the library Start free trial

BBQ: A Hand-Built Bias Benchmark for Question Answering

Tags

At a glance

More in Datasets and benchmarks

Related resources

Build your AI governance program