AfriMMLU (IrokoBench)
Human-translated multiple-choice question-answering benchmark covering 16 to 17 African languages plus English and French, derived from a subset of MMLU spanning subjects such as maths, geography and law. Distributed as CSV and Parquet on Hugging Face, it forms part of the IrokoBench suite (AfriMMLU, AfriMGSM, AfriXNLI). Licensed under Apache 2.0.
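As a rough illustration of how a multiple-choice benchmark distributed as CSV might be scored, the sketch below parses a small in-memory CSV and computes accuracy for a trivial baseline. The column names (`question`, `choices`, `answer`), the `|` separator and the two sample rows are hypothetical and chosen for illustration only; check the actual files for the real schema before adapting it.

```python
import csv
import io

# Hypothetical rows: the column names (question, choices, answer) and the
# "|" separator are assumptions for illustration, not the confirmed schema.
SAMPLE_CSV = """question,choices,answer
What is 2 + 3?,4|5|6|7,B
Which shape has three sides?,Square|Circle|Triangle|Hexagon,C
"""


def load_rows(text):
    """Parse CSV text into dicts, splitting the choices field into a list."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["choices"] = row["choices"].split("|")
        rows.append(row)
    return rows


def accuracy(rows, predict):
    """Fraction of rows where predict(question, choices) equals the gold letter."""
    if not rows:
        return 0.0
    correct = sum(
        1 for r in rows if predict(r["question"], r["choices"]) == r["answer"]
    )
    return correct / len(rows)


def always_b(question, choices):
    """Trivial baseline that always answers B."""
    return "B"


rows = load_rows(SAMPLE_CSV)
print(accuracy(rows, always_b))
```

Replacing `always_b` with a function that queries a model and returns a letter would give per-language accuracy when run over each language's file.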
- Category: Datasets
- Pricing: Free / open
- Country: 🌍 Pan-African
- Last verified: 5 Jul 2026
Related in Datasets
Hausa Visual Genome (HausaVG)
Multimodal Hausa-English dataset of 32,923 images with paired English/Hausa region descriptions (train/dev/test/challenge splits), post-edited by HausaNLP and Bayero University Kano translators for English-to-Hausa machine translation and image description.
AfriQA
Cross-lingual open-retrieval question-answering dataset with human-translated QA pairs for 10 African languages (incl. Hausa, Igbo, Yoruba), totaling 12,159 examples across train/validation/test splits. From the Masakhane initiative.
MasakhaNER 2.0
Largest high-quality named-entity-recognition corpus for 20 African languages (incl. Nigerian Pidgin, Hausa, Igbo, Yoruba) with PER/ORG/LOC/DATE tags over news-domain text, totaling ~152,786 rows. Built by the Masakhane community.
