AfriMMLU (IrokoBench)

Datasets

Language / NLP

Docs live

Human-translated multiple-choice question-answering benchmark covering 16 to 17 African languages plus English and French, derived from a subset of MMLU spanning subjects such as maths, geography and law. Distributed as CSV and Parquet on Hugging Face, it forms part of the IrokoBench suite (AfriMMLU, AfriMGSM, AfriXNLI). Licensed under Apache 2.0.

Category: Datasets
Pricing: Free / open
Country: 🌍 Pan-African
Last verified: 5 Jul 2026

Compare AfriMMLU (IrokoBench)

Side-by-side verified specs against its closest Language / NLP alternatives.

AfriMMLU (IrokoBench) vs AfriQA
AfriMMLU (IrokoBench) vs Hausa Visual Genome (HausaVG)
AfriMMLU (IrokoBench) vs MasakhaNER 2.0
AfriMMLU (IrokoBench) vs MasakhaNEWS

Related in Datasets

Hausa Visual Genome (HausaVG)

Datasets

Multimodal Hausa-English dataset of 32,923 images with paired English/Hausa region descriptions (train/dev/test/challenge splits), post-edited by HausaNLP and Bayero University Kano translators for English-to-Hausa machine translation and image description.

Docs live

Language / NLP

Verified Jul 2026
Free / CC-BY-NC-SA 4.0

AfriQA

Datasets

Cross-lingual open-retrieval question-answering dataset with human-translated QA pairs for 10 African languages (incl. Hausa, Igbo, Yoruba), totaling 12,159 examples across train/validation/test splits. From the Masakhane initiative.

Docs live

Language / NLP

Verified Jul 2026
Free / CC-BY-SA 4.0

MasakhaNER 2.0

Datasets

Largest high-quality named-entity-recognition corpus for 20 African languages (incl. Nigerian Pidgin, Hausa, Igbo, Yoruba) with PER/ORG/LOC/DATE tags over news-domain text, totaling ~152,786 rows. Built by the Masakhane community.

Docs live

Language / NLP

Verified Jul 2026
Free / CC-BY-NC 4.0
