AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages vs MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition

A verified, side-by-side comparison. Both records are status-checked by Findra, so you are comparing what each actually offers today, not a stale listing.

AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages

MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition

Tags

AfriHate: african-languages, hate-speech, nlp-benchmark, content-moderation

MasakhaNER 2.0: named-entity-recognition, african-languages, nlp-benchmark, transfer-learning

Summary

AfriHate is a multilingual benchmark of hate speech and abusive language datasets covering 15 African languages, annotated by native speakers. The paper contributes classification baselines along with hate speech and offensive language lexicons, and analyses why keyword-based moderation fails for low-resource African languages. It was released on arXiv in January 2025.

MasakhaNER 2.0 introduces the largest human-annotated named entity recognition dataset for 20 African languages and studies Africa-centric cross-lingual transfer learning. The paper reports that choosing the best transfer language improves zero-shot F1 by an average of 14 points across the 20 languages compared with transferring from English.

Full details: AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages

Full details: MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition