AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages vs AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages

A verified, side-by-side comparison. Both records are status-checked by Findra, so you are comparing what each actually offers today, not a stale listing.

| Field | AfriHate | AfriSenti |
| --- | --- | --- |
| Category | Research | Research |
| Type | NLP benchmark | NLP benchmark |
| Country | 🌍 Pan-African | 🌍 Pan-African |
| Docs status | Docs live | Docs live |
| Licensing | | |
| Pricing | Free / open | Free / open |
| Verified | Unverified | Unverified |
| Last verified | 5 Jul 2026 | 5 Jul 2026 |
| Tags | african-languages, hate-speech, nlp-benchmark, content-moderation | african-languages, nlp-benchmark, sentiment-analysis, semeval |

Summary
AfriHate is a multilingual benchmark of hate speech and abusive language datasets covering 15 African languages, annotated by native speakers. The paper contributes classification baselines and hate speech and offensive language lexicons, and analyses why keyword-based moderation fails for low-resource African languages. It was released on arXiv in January 2025.
AfriSenti is a sentiment analysis benchmark of more than 110,000 tweets in 14 African languages spanning four language families, annotated by native speakers. It underpinned SemEval-2023 Task 12, a shared task that attracted more than 200 participants, and documents data collection, annotation and baseline methods for low-resource languages.