Research registry
Papers, benchmarks, standards and reference architectures for African digital infrastructure.
Adapting Pre-trained Language Models to African Languages via Multilingual Adaptive Fine-Tuning
This paper introduces multilingual adaptive fine-tuning (MAFT), applied to 17 of the most-resourced African languages plus three other high-resource languages widely spoken on the continent, producing the AfroXLMR family of models. Removing non-African-script tokens from the vocabulary cuts model size by roughly 50 percent, and the resulting models remain competitive with single-language adaptive fine-tuning on named entity recognition, topic classification and sentiment analysis while needing far less disk space.
AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages
AfriHate is a multilingual benchmark of hate speech and abusive language datasets covering 15 African languages, annotated by native speakers. The paper contributes classification baselines and hate speech and offensive language lexicons, and analyses why keyword-based moderation fails for low-resource African languages. It was released on arXiv in January 2025.
AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages
This paper introduces AfriMTE, human translation-evaluation data with simplified annotation guidelines for 13 African languages, and AfriCOMET, a learned machine translation quality metric built on an African-centric multilingual encoder. It addresses the difficulty of measuring translation progress for under-resourced African languages and reports improved correlation with human judgment over existing metrics.
AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages
AfriQA is the first cross-lingual open-retrieval question answering benchmark for African languages, with more than 12,000 XOR-QA examples across 10 African languages. The paper shows that current automatic translation and multilingual retrieval methods perform poorly for these languages, where in-language digital content is scarce.
AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages
AfriSenti is a sentiment analysis benchmark of more than 110,000 tweets in 14 African languages spanning four language families, annotated by native speakers. It underpinned SemEval-2023 Task 12, a shared task that attracted more than 200 participants, and documents data collection, annotation and baseline methods for low-resource languages.
Digital Public Infrastructure and Development: A World Bank Group Approach
This World Bank Group paper sets out a shared approach to digital public infrastructure, framed around interoperable digital identity, digital payments and data-exchange layers as the foundation for inclusive digital economies. It provides definitions, guiding principles and safeguards to help countries plan and govern DPI. Much of the World Bank's DPI work targets low- and middle-income countries, including across Africa.
GSMA Mobile Economy Africa 2025
GSMA Intelligence's Mobile Economy Africa 2025 report quantifies the mobile industry's economic and social contribution across the continent. It reports that the mobile sector contributed 220 billion US dollars, about 7.7 percent of GDP, in 2024, and projects growth to about 270 billion US dollars by 2030. The report also covers 4G and 5G adoption, the mobile internet usage gap, and AI and satellite connectivity trends.
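As a rough reading aid, the growth rate and GDP base implied by the figures quoted in this entry can be worked out as follows. This is a minimal sketch that uses only the numbers above; the variable names are illustrative, and the result is indicative only, since the report may define its 2030 projection on a different basis.

```python
# Figures as quoted in the registry entry (billions of US dollars).
contribution_2024 = 220.0
contribution_2030 = 270.0
gdp_share_2024 = 0.077  # "about 7.7 percent of GDP"

years = 2030 - 2024

# Compound annual growth rate implied by the two endpoints.
implied_cagr = (contribution_2030 / contribution_2024) ** (1 / years) - 1

# GDP base implied by the 2024 contribution and its quoted share of GDP.
implied_gdp_2024 = contribution_2024 / gdp_share_2024

print(f"Implied annual growth, 2024 to 2030: {implied_cagr:.1%}")
print(f"Implied 2024 GDP base: {implied_gdp_2024:,.0f} billion US dollars")
```

On these inputs the projection corresponds to growth of roughly 3.5 percent a year.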
GSMA Mobile Money API Specification
The GSMA Mobile Money API is a harmonised, industry-governed REST and JSON specification covering common mobile money use cases such as merchant payments, disbursements, P2P and international transfers, bill payments and account linking. It defines a standard set of transaction types abstracted from provider-specific implementations and ships a developer portal with a sandbox simulator, an OAuth 2.0 gateway and SDKs. Version 1.2 is the current release.
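To illustrate what a harmonised REST and JSON transaction might look like, the sketch below assembles a merchant payment request without sending it. The path `/transactions/type/merchantpay`, the `X-CorrelationID` header, and the `debitParty` and `creditParty` key/value layout are assumptions here and should be checked against the published specification. The function name and sample values are hypothetical.

```python
import json
import uuid

# Assumed resource path for a merchant payment; verify against the spec.
MERCHANT_PAY_PATH = "/transactions/type/merchantpay"


def build_merchant_payment(payer_msisdn, merchant_account_id, amount, currency):
    """Return (path, headers, json_body) for a hypothetical merchant payment."""
    headers = {
        "Content-Type": "application/json",
        # Assumed client-generated identifier used to correlate the request.
        "X-CorrelationID": str(uuid.uuid4()),
    }
    body = {
        "amount": f"{amount:.2f}",  # amounts are carried as strings here
        "currency": currency,       # ISO 4217 code, e.g. "KES"
        "debitParty": [{"key": "msisdn", "value": payer_msisdn}],
        "creditParty": [{"key": "accountid", "value": merchant_account_id}],
    }
    return MERCHANT_PAY_PATH, headers, json.dumps(body)


path, headers, payload = build_merchant_payment("+254700000000", "2999", 200, "KES")
print(path)
print(payload)
```

In a real integration the request would also carry an OAuth 2.0 access token obtained from the provider's gateway, which is omitted here.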
GSMA State of the Industry Report on Mobile Money 2026
The GSMA State of the Industry Report on Mobile Money is the industry's definitive annual reference, prepared by the GSMA Mobile Money programme. The 2026 edition (published March 2026) reports that more than 2 trillion US dollars flowed through mobile money in 2025 and that the industry reached 2.3 billion registered accounts, up by 268 million. Most new registered and active accounts came from Sub-Saharan Africa, the industry's largest region.
IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models
IrokoBench is a human-translated evaluation benchmark covering 17 typologically diverse low-resource African languages across three tasks: natural language inference (AfriXNLI), mathematical reasoning (AfriMGSM) and knowledge-based multiple-choice QA (AfriMMLU). The paper evaluates open and proprietary LLMs and documents a large gap between high-resource languages and African languages, and between open and proprietary models: the best open model reaches only about 63 percent of the performance of GPT-4o, the best proprietary model. It was published at NAACL 2025.
Level One Project Guide (Gates Foundation)
The Level One Project Guide is the Gates Foundation reference model for inclusive instant payment systems and pro-poor digital financial services. It sets out principles and a reference architecture for real-time retail payments that connect banks and non-bank providers through a shared national switch, and it informed the design of Mojaloop. The current guide edition was published in 2025.
MOSIP (Modular Open Source Identity Platform) Specifications
MOSIP is a modular open-source foundational digital identity platform that governments use to build national ID systems while retaining ownership and avoiding vendor lock-in. Its specifications build on open standards including OAuth 2.0 and OpenID Connect, OpenAPI, ISO/IEC 19794 biometrics and CBEFF, and add the Claim 169 QR specification for offline identity verification. It is used by multiple African governments as the basis of their foundational digital ID programmes.
MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition
MasakhaNER 2.0 introduces the largest human-annotated named entity recognition dataset for 20 African languages and studies Africa-centric cross-lingual transfer learning. The paper reports that choosing the best transfer language improves zero-shot F1 by an average of 14 points across the 20 languages compared with transferring from English.
Mojaloop Open API for FSP Interoperability Specification
The Mojaloop specification set defines an open API for interoperable transactions between financial service providers, enabling a payer at one provider to pay a payee at another. It documents the FSP Interoperability (FSPIOP), Administration, Settlement and Third-party Payment Initiation APIs and is published under a Creative Commons license. It is the reference standard behind several national instant payment switches used in Africa and beyond.
Open Banking Nigeria API Standard
The Open Banking Nigeria API Standard is a RESTful specification for secure data sharing and payment initiation between Nigerian banks and third-party providers. It defines Registration, Meta Directory, Data-Inquiry (Open Data) and Payment API categories and is maintained by the Open Technology Foundation. It aligns with the Central Bank of Nigeria's open banking regulatory framework, for which the CBN issued Operational Guidelines in 2023.
Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages
This landmark Masakhane paper proposes a participatory research model that lets non-specialist speakers meaningfully contribute to building machine translation for their own languages. The work released novel translation datasets and MT benchmarks for more than 30 African languages, with human evaluations for about a third of them. It was published in Findings of EMNLP 2020.
World Bank ID4D Practitioner's Guide
The ID4D Practitioner's Guide is the World Bank's reference for designing and implementing inclusive, trusted digital identification systems, organised around the Principles on Identification for Sustainable Development. It covers system design, technology choices, procurement and governance, and is a core reference for identity as digital public infrastructure. The ID4D initiative focuses heavily on low- and middle-income countries, with much of its flagship work in Africa.
