SabiYarn-125M vs SERENGETI
A verified, side-by-side comparison. Both records are status-checked by Findra, so you are comparing what each actually offers today, not a stale listing.
Field: SabiYarn-125M | SERENGETI
Category: AI Resources | AI Resources
Type: LLM | LLM
Country: 🇳🇬 Nigeria | 🌍 Pan-African
Docs status: Docs live | Docs live
Licensing: Institutional only
Pricing: Open weights | Free for research; commercial use requires contacting authors
Verified: Unverified | Verified
Last verified: 5 Jul 2026 | 5 Jul 2026
Tags: yoruba, hausa, igbo, nigerian-languages, gpt | nlp, masked-language-model, 517-languages, afrocentric, ubc-nlp
Summary
SabiYarn-125M is a 125M-parameter decoder-only foundation model pretrained on Nigerian-language text, the first in the SabiYarn series. It supports English, Yoruba, Hausa, Igbo and Nigerian Pidgin plus Fulfulde, Efik and Urhobo, with fine-tuned variants for translation, NER, sentiment and diacritization. It was built by Aletheia.ai Research Lab and presented at the AfricaNLP 2025 workshop.
SERENGETI is a massively multilingual masked language model covering 517 African languages and language varieties across five scripts, achieving state-of-the-art results on the AfroNLU benchmark. It was developed by the UBC Deep Learning and NLP Lab as an Afrocentric resource.