AfriHate Hate Speech Datasets vs AfriQA

A verified, side-by-side comparison. Both records are status-checked by Findra, so you are comparing what each actually offers today, not a stale listing.

Tags

AfriHate Hate Speech Datasets: nlp, african-languages, hate-speech, abusive-language, twitter

AfriQA: nlp, african-languages, question-answering, cross-lingual, open-retrieval

Links

AfriHate Hate Speech Datasets: Website, Docs

AfriQA: Website, Docs, GitHub

Summary

AfriHate Hate Speech Datasets: Multilingual collection of hate speech and abusive language datasets covering 15 African languages, built from tweets annotated by native speakers. Each instance carries labels from 3 to 4 annotators, identified by anonymous annotator IDs. The data is downloadable on HuggingFace. Published at NAACL 2025.

AfriQA: Cross-lingual open-retrieval question-answering dataset with human-translated QA pairs for 10 African languages (including Hausa, Igbo, and Yoruba), totaling 12,159 examples across train/validation/test splits. From the Masakhane initiative.
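Because each AfriHate instance carries 3 to 4 annotator labels, a common first step is to collapse them into a single label. The following is a minimal sketch of majority voting, not the dataset authors' aggregation method. The record layout, the field names (`annotations`, `annotator_id`, `label`), and the label strings are hypothetical and chosen only for illustration; check the dataset card for the actual schema.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(labels: Sequence[str]) -> Optional[str]:
    """Return the most frequent label, or None if the input is empty or the top two labels tie."""
    ranked = Counter(labels).most_common()
    if not ranked:
        return None
    # With 4 annotators a 2-2 split is possible; report it as unresolved.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical record layout (an assumption, not the real AfriHate schema).
instance = {
    "text": "example tweet text",
    "annotations": [
        {"annotator_id": "a01", "label": "Abusive"},
        {"annotator_id": "a07", "label": "Abusive"},
        {"annotator_id": "a12", "label": "Neutral"},
    ],
}

votes = [a["label"] for a in instance["annotations"]]
resolved = majority_label(votes)
```

Returning `None` on a tie keeps disagreement visible, so tied instances can be dropped or sent for adjudication rather than assigned an arbitrary label.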
