SabiYarn-125M
SabiYarn-125M is a 125M-parameter decoder-only foundation model pretrained on Nigerian-language text, and the first model in the SabiYarn series. It supports English, Yoruba, Hausa, Igbo and Nigerian Pidgin, plus Fulfulde, Efik and Urhobo, and has fine-tuned variants for translation, NER, sentiment analysis and diacritization. It was built by Aletheia.ai Research Lab and presented at the AfricaNLP 2025 workshop.
- Category: AI Resources
- Pricing: Open weights
- Country: 🇳🇬 Nigeria
- Last verified: 5 Jul 2026
Compare SabiYarn-125M
Side-by-side, verified specs against its closest LLM alternatives.
Related in AI Resources
AfroLM
A multilingual masked language model pretrained from scratch on 23 African languages using a self-active learning framework, outperforming AfriBERTa, mBERT and XLMR-base on NER and sentiment tasks. Created by Bonaventure Dossou and collaborators, published at SustaiNLP/EMNLP 2022.
SERENGETI
A massively multilingual masked language model covering 517 African languages and varieties across five scripts, achieving state-of-the-art results on the AfroNLU benchmark. Developed by the UBC Deep Learning and NLP Lab as an Afrocentric resource.
N-ATLaS
Nigeria's first government-backed multilingual LLM (Sep 2025): a Llama-3 8B model fine-tuned on 400M+ tokens across four Nigerian languages. Produced by NCAIR/NITDA and Awarri.
