AfriMMLU (IrokoBench) vs Hausa Visual Genome (HausaVG)
A verified, side-by-side comparison. Both records are status-checked by Findra, so you are comparing what each actually offers today, not a stale listing.
| | AfriMMLU (IrokoBench) | Hausa Visual Genome (HausaVG) |
|---|---|---|
| Category | Datasets | Datasets |
| Type | Language / NLP | Language / NLP |
| Country | 🌍 Pan-African | 🇳🇬 Nigeria |
| Docs status | Docs live | Docs live |
| Licensing / Pricing | Free / open | Free / CC-BY-NC-SA 4.0 |
| Verified | Unverified | Verified |
| Last verified | 5 Jul 2026 | 5 Jul 2026 |
| Tags | nlp, african-languages, question-answering, evaluation, benchmark | nlp, hausa, machine-translation, multimodal, image-captioning |
Summary
AfriMMLU (IrokoBench): Human-translated multiple-choice question-answering benchmark covering 16 to 17 African languages plus English and French, derived from a subset of MMLU spanning subjects such as maths, geography and law. It is distributed as CSV and Parquet files on Hugging Face and forms part of the IrokoBench suite (AfriMMLU, AfriMGSM, AfriXNLI). Licensed under Apache 2.0.
Hausa Visual Genome (HausaVG): Multimodal Hausa-English dataset of 32,923 images with paired English/Hausa region descriptions, split into train, dev, test and challenge sets. The Hausa descriptions were post-edited by translators from HausaNLP and Bayero University Kano. The dataset targets English-to-Hausa machine translation and image description.