AI registry

AI Resources

African language models, speech & text datasets, and AI infrastructure.

6 results in NLP Model

AfriBERTa

AfriBERTa is a multilingual masked language model (XLM-RoBERTa architecture, ~126M params) pretrained from scratch on 11 African languages, including Amharic, Hausa, Igbo, Swahili, and Yoruba. Built by the Castorini lab (University of Waterloo) for text classification and named entity recognition (NER) on low-resource African languages.

Docs live

NLP Model

Verified Jun 2026

AfriTeVa V2

An improved T5 v1.1 model (428M params) pretrained on the Wura corpus covering 16 African languages, with gains on classification, translation, summarization and cross-lingual QA. Published at EMNLP 2023 by the Castorini group with African lead authors.

Docs live

NLP Model

Verified Jun 2026

Free / open weights

AfroLID

A neural language identification toolkit that detects which of 517 African languages and varieties a text belongs to across 14 language families, reaching 97.41 macro-F1 after fine-tuning on SERENGETI. Developed by the UBC Deep Learning and NLP Lab and published at EMNLP 2022.

Docs live

Institutional only

NLP Model

Verified Jun 2026

Free for research use

AfroXLMR

AfroXLMR is an XLM-R-large model (0.6B params) adapted to African languages via multilingual adaptive fine-tuning, covering 17 African languages plus Arabic, French, and English. Created by David Adelani (Davlan) and published at COLING 2022 for cross-lingual transfer tasks such as named entity recognition (NER).

Docs live

NLP Model

Verified Jun 2026

Cheetah

A massively multilingual natural language generation model supporting 517 African languages. It outperforms baselines on five of the seven AfroNLG tasks, which include summarization and translation. Developed by the UBC Deep Learning and NLP Lab and published at ACL 2024.

Docs live

Institutional only

NLP Model

Verified Jun 2026

Free for research use

GhanaNLP ABENA

ABENA (A BERT Now in Akan) is a family of BERT, DistilBERT, and RoBERTa language models for Twi/Akan, covering both the Asante and Akuapem dialects and released by the open-source GhanaNLP initiative. It is distinct from GhanaNLP's Khaya translation product.

Docs live

NLP Model

Verified Jun 2026

Free