AfriHate Hate Speech Datasets
Multilingual collection of hate speech and abusive language datasets covering 15 African languages, built from tweets annotated by native speakers. Each instance carries labels from three to four annotators, identified by anonymous annotator IDs. The datasets can be downloaded from Hugging Face. Published at NAACL 2025.
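Because each instance carries three to four individual annotator labels, a common first step is to collapse them into a single label. The snippet below is a minimal sketch of majority voting, not the AfriHate authors' procedure. The record layout, the `annotations` field, the annotator IDs, and the label names are hypothetical placeholders; check the actual column names on the dataset page before adapting it.

```python
from collections import Counter


def majority_label(labels):
    """Return the most frequent label, or None if empty or tied for first place."""
    counts = Counter(labels).most_common()
    if not counts:
        return None
    # With four annotators a 2-2 split is possible; flag it instead of guessing.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


# Hypothetical records: field names, annotator IDs, and labels are assumptions.
records = [
    {
        "text": "example tweet 1",
        "annotations": {"ann_01": "Hate", "ann_02": "Hate", "ann_03": "Abuse"},
    },
    {
        "text": "example tweet 2",
        "annotations": {
            "ann_04": "Neutral",
            "ann_05": "Abuse",
            "ann_06": "Neutral",
            "ann_07": "Abuse",
        },
    },
]

# One aggregated label per record; None marks a tie that needs adjudication.
aggregated = [majority_label(list(r["annotations"].values())) for r in records]
```

Returning `None` on a tie keeps disagreement visible, which may matter for subjective tasks such as hate speech labelling where annotator disagreement can itself be informative.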
- Category: Datasets
- Pricing: Free / open
- Country: 🌍 Pan-African
- Last verified: 5 Jul 2026
Related in Datasets
Hausa Visual Genome (HausaVG)
Multimodal Hausa-English dataset of 32,923 images with paired English/Hausa region descriptions (train/dev/test/challenge splits), post-edited by HausaNLP and Bayero University Kano translators for English-to-Hausa machine translation and image description.
AfriQA
Cross-lingual open-retrieval question-answering dataset with human-translated QA pairs for 10 African languages (incl. Hausa, Igbo, Yoruba), totaling 12,159 examples across train/validation/test splits. From the Masakhane initiative.
MasakhaNER 2.0
Largest high-quality named-entity-recognition corpus for 20 African languages (incl. Nigerian Pidgin, Hausa, Igbo, Yoruba) with PER/ORG/LOC/DATE tags over news-domain text, totaling ~152,786 rows. Built by the Masakhane community.
