Tools registry

Developer Tools

SDKs, libraries, validators and utilities that solve problems specific to African markets.

9 results in NLP Library

Hausa-NLP

Tools

A community resource hub for Hausa NLP that provides a Hausa corpus, sentiment lexicons (including translated lexicons) and resources for sentiment analysis, hate-speech detection and machine translation. Maintained by the HausaNLP community.

Docs live
NLP Library
Verified Jun 2026
Free / open-source

HornMorpho

Tools

HornMorpho is a Python program that performs morphological analysis and generation of Amharic, Oromo and Tigrinya words, breaking words into constituent morphemes and generating words from roots and grammatical structure. It originated from the L3 Project at Indiana University.

Docs live
NLP Library
Verified Jun 2026
Free / open-source

PuoBERTa

Tools

PuoBERTa is a RoBERTa-based masked language model purpose-built for Setswana, trained on the PuoData corpus by the Data Science for Social Impact group. It ships with example scripts for fill-mask, news classification, NER and POS tagging via HuggingFace Transformers.

Docs live
NLP Library
Verified Jun 2026
Free / open-source

SOMALI_NLP

Tools

SOMALI_NLP is a Python NLP toolkit for the Somali language. It provides stop-word lists, stemmers for morphological analysis, tokenizers, collocation analysis, and string-distance and spelling models. It draws on a companion Somali Wikipedia corpus.

Docs live
NLP Library
Verified Jun 2026
Free / open-source

amseg

Tools

amseg is an Amharic document segmentation and normalization tool that splits Ethiopic text into sentences and tokens, normalizes character variants and transliterates between Latin and Fidel. Maintained under the University of Hamburg Semantic Models for Amharic project.

Docs live
NLP Library
Verified Jun 2026
Free / open-source
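
The two steps the amseg entry describes, sentence splitting and character-variant normalization, can be sketched with the Python standard library alone. This is not amseg's API: the function names, the set of sentence-final marks and the five-entry variant table below are illustrative assumptions, and a fuller normalizer would need to cover every vowel order of each variant letter.

```python
import re

# Assumed sentence-final marks: Ethiopic full stop, Ethiopic question mark,
# and ASCII ? and !. The pattern is zero-width, so the mark stays with its sentence.
SENTENCE_END = re.compile(r"(?<=[።፧?!])")

# Illustrative subset only: base forms of letters that are pronounced alike in Amharic.
VARIANTS = str.maketrans({"ሐ": "ሀ", "ኀ": "ሀ", "ሠ": "ሰ", "ዐ": "አ", "ፀ": "ጸ"})


def split_sentences(text: str) -> list[str]:
    """Split text after each sentence-final mark and drop empty pieces."""
    return [part.strip() for part in SENTENCE_END.split(text) if part.strip()]


def normalize_variants(text: str) -> str:
    """Map each variant letter in the table to its assumed canonical form."""
    return text.translate(VARIANTS)
```

Running normalization before any dictionary lookup or frequency count means spelling variants of the same word collapse to one form.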

etnltk

Tools

etnltk, the Ethiopian Natural Language Toolkit, is a spaCy/NLTK-inspired Python library (PyPI: etnltk) for Amharic and other Ethiopian languages. It provides text normalization, short-form expansion and word/sentence tokenization. Maintained by robeleq.

Docs live
NLP Library
Verified Jun 2026
Free / open-source

iranlowo

Tools

A Python utility library (PyPI: iranlowo) for analysing and preprocessing Yoruba text: diacritic stripping and restoration via pretrained models, text normalization, character verification and corpus tools. Maintained by the Niger-Volta-LTI organization.

Docs live
NLP Library
Verified Jun 2026
Free / open-source
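
Of the features listed for iranlowo, diacritic stripping is the one that can be illustrated without the library. The sketch below uses only the standard-library unicodedata module and is a stand-in, not iranlowo's own function; the name strip_diacritics is hypothetical. It removes both tone marks and underdots.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove all combining marks (tone marks and underdots) from text."""
    # NFD separates each base letter from its combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # "Mn" is the Unicode category for nonspacing (combining) marks.
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", stripped)


print(strip_diacritics("Yorùbá"))  # prints: Yoruba
```

Stripping is lossy, since words that differ only in tone or underdot become identical. That is why the reverse direction, restoration, needs a trained model rather than a rule like this one.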

stopwords-sw

Tools

A comprehensive Swahili (sw) stopword collection distributed in JSON and plain-text formats (npm/Bower package stopwords-sw) for text preprocessing in NLP pipelines. Maintained by the stopwords-iso project.

Docs live
NLP Library
Verified Jun 2026
Free / open-source
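
A stopword list such as stopwords-sw is typically applied as a filter over tokens. The sketch below assumes the JSON file is a flat array of strings. The four-word sample is a hypothetical stand-in for the real file, and the function names are illustrative.

```python
import json

# Hypothetical sample in the assumed shape (a flat JSON array of words).
# It is not the package's actual list.
SAMPLE_JSON = '["na", "ya", "wa", "kwa"]'


def load_stopwords(raw_json: str) -> set[str]:
    """Parse a JSON array of words into a lowercase set for fast lookup."""
    return {word.lower() for word in json.loads(raw_json)}


def remove_stopwords(tokens: list[str], stopwords: set[str]) -> list[str]:
    """Keep only the tokens whose lowercase form is not a stopword."""
    return [tok for tok in tokens if tok.lower() not in stopwords]


tokens = "Mtoto wa shule na mwalimu".split()
print(remove_stopwords(tokens, load_stopwords(SAMPLE_JSON)))
# prints: ['Mtoto', 'shule', 'mwalimu']
```

In a real pipeline the sample string would be replaced by the contents of the distributed JSON file, read from disk.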

uroman

Tools

uroman is a universal romanizer that converts text in virtually any script to the Latin alphabet, with dedicated handling for Amharic and the Ge'ez/Ethiopic script. It also adds initial support for Coptic and processes script-native numerals.

Docs live
NLP Library
Verified Jun 2026
Free / open-source