AfroLM vs SabiYarn-125M
A verified, side-by-side comparison. Both records are status-checked by Findra, so you are comparing what each actually offers today, not a stale listing.
| | AfroLM | SabiYarn-125M |
| --- | --- | --- |
| Category | AI Resources | AI Resources |
| Type | LLM | LLM |
| Country | 🌍 Pan-African | 🇳🇬 Nigeria |
| Docs status | Docs live | Docs live |
| Licensing | | |
| Pricing | Free / open weights | Open weights |
| Verified | Verified | Unverified |
| Last verified | 5 Jul 2026 | 5 Jul 2026 |
| Tags | nlp, african-languages, masked-language-model, active-learning, data-efficient | yoruba, hausa, igbo, nigerian-languages, gpt |
Summary
AfroLM is a multilingual masked language model pretrained from scratch on 23 African languages using a self-active learning framework. It is reported to outperform AfriBERTa, mBERT and XLMR-base on NER and sentiment tasks. It was created by Bonaventure Dossou and collaborators and published at SustaiNLP/EMNLP 2022.
SabiYarn-125M is a 125M-parameter decoder-only foundation model pretrained on Nigerian-language text, the first in the SabiYarn series. It supports English, Yoruba, Hausa, Igbo and Nigerian Pidgin plus Fulfulde, Efik and Urhobo, with fine-tuned variants for translation, NER, sentiment and diacritization. It was built by Aletheia.ai Research Lab and presented at the AfricaNLP 2025 workshop.