AfriSpeech-Dialog vs Yoruba Speech-Text Parallel Corpus

A verified, side-by-side comparison. Both records are status-checked by Findra, so you are comparing what each actually offers today, not a stale listing.

AfriSpeech-Dialog Yoruba Speech-Text Parallel Corpus

Tags

AfriSpeech-Dialog: speech, asr, audio, speaker-diarization, conversational

Yoruba Speech-Text Parallel Corpus: speech, tts, asr, yoruba, parallel-corpus

Summary

AfriSpeech-Dialog: Conversational African-accented speech corpus (~6 hours) of 50 two-speaker dialogues across 11 accents (Hausa, Yoruba, Igbo, Swahili, Sesotho and others) from Nigeria, Kenya and South Africa, for ASR and speaker diarization. By Intron Health.

Yoruba Speech-Text Parallel Corpus: Large Yoruba parallel speech-text corpus of 1,647,022 audio-text pairs (~21.5 GB, WAV), aligned with the MMS-300M Forced Aligner, for ASR and TTS, with clips of 0.04 to 12 seconds.
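To put the two sets of figures on a comparable footing, the short sketch below derives per-item averages from the numbers quoted in the summaries. It is only back-of-the-envelope arithmetic: the function names are illustrative, "GB" is assumed to mean 10^9 bytes, and the audio format (16 kHz, 16-bit, mono PCM) is an assumption that neither summary states, so the implied clip duration should be read as a rough guide at best.

```python
# Figures quoted in the summaries; both totals are approximate.
AFRISPEECH_HOURS = 6.0
AFRISPEECH_DIALOGUES = 50
YORUBA_PAIRS = 1_647_022
YORUBA_SIZE_GB = 21.5  # assumed to be decimal gigabytes (10**9 bytes)

# ASSUMPTION (not stated in the source): 16 kHz, 16-bit, mono PCM WAV.
ASSUMED_BYTES_PER_SECOND = 16_000 * 2


def minutes_per_dialogue(hours: float, dialogues: int) -> float:
    """Average dialogue length in minutes."""
    return hours * 60 / dialogues


def bytes_per_clip(size_gb: float, pairs: int) -> float:
    """Average on-disk size of one audio clip, in bytes."""
    return size_gb * 1e9 / pairs


def implied_seconds_per_clip(avg_bytes: float, bytes_per_second: int) -> float:
    """Average clip duration implied by the assumed audio format.

    WAV header overhead is ignored, so this slightly overestimates.
    """
    return avg_bytes / bytes_per_second


if __name__ == "__main__":
    avg_minutes = minutes_per_dialogue(AFRISPEECH_HOURS, AFRISPEECH_DIALOGUES)
    avg_bytes = bytes_per_clip(YORUBA_SIZE_GB, YORUBA_PAIRS)
    avg_seconds = implied_seconds_per_clip(avg_bytes, ASSUMED_BYTES_PER_SECOND)
    print(f"AfriSpeech-Dialog: {avg_minutes:.1f} min per dialogue")
    print(f"Yoruba corpus: {avg_bytes / 1000:.1f} KB per clip")
    print(f"Yoruba corpus: {avg_seconds:.2f} s per clip (under the assumed format)")
```

On these figures, AfriSpeech-Dialog averages about 7.2 minutes per dialogue, while the Yoruba corpus averages roughly 13 KB per clip. The two corpora therefore sit at opposite ends of the scale: a small number of long conversations versus a very large number of short aligned segments.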
