Microsoft Research has introduced Paza, an initiative to advance speech technology for low-resource languages. Paza has two parts: PazaBench, an automatic speech recognition (ASR) leaderboard for languages with little available training data, and the Paza ASR models, which are fine-tuned for use in real-world settings.

PazaBench: a new ASR leaderboard

PazaBench is the first ASR leaderboard dedicated to low-resource languages. At launch it covers 39 African languages and 52 state-of-the-art ASR and language models. The platform aggregates public and community-sourced datasets, making it easier to compare model performance across languages and regions.

PazaBench tracks three core metrics:

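  1. Character Error Rate (CER): the share of characters transcribed incorrectly, which matters most for languages with complex word forms, where a single wrong character would otherwise count as a whole wrong word.
  2. Word Error Rate (WER): the share of words transcribed incorrectly, the standard measure of word-level transcription accuracy.
  3. RTFx (Inverse Real-Time Factor): audio duration divided by processing time, so a higher value means faster transcription.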
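
The three metrics can be illustrated with a minimal Python sketch. It assumes the conventional definitions: WER and CER are the edit distance between reference and hypothesis divided by the reference length, counted in words and characters respectively, and RTFx is audio duration divided by processing time. The function names and the Swahili sample strings are illustrative only, and the text normalization PazaBench applies before scoring (casing, punctuation) is not described here, so none is applied.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference item missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra item in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def rtfx(audio_seconds, processing_seconds):
    """Inverse real-time factor: values above 1 mean faster than real time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# One dropped character inside a long word (illustrative strings).
reference = "ninakupenda sana"
hypothesis = "ninakupeda sana"

print(wer(reference, hypothesis))  # 0.5    (1 of 2 words is wrong)
print(cer(reference, hypothesis))  # 0.0625 (1 of 16 characters is wrong)
print(rtfx(60.0, 2.0))             # 30.0   (60 s of audio transcribed in 2 s)
```

The example shows why CER is reported alongside WER: a single missing character turns a long word into a full word error, so WER reads 50% while CER reads about 6%.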

Paza ASR Models: built with and for Kenyan languages

The Paza ASR family comprises three models, each fine-tuned from a state-of-the-art base model. Each one targets six languages: Swahili (a mid-resource language) and five low-resource Kenyan languages, namely Dholuo, Kalenjin, Kikuyu, Maasai, and Somali. The models were fine-tuned on a mix of public and proprietary datasets.

Paza models include:

  1. Paza-Phi-4-Multimodal-Instruct: fine-tuned from Microsoft's Phi-4-multimodal-instruct, a multimodal language model, and optimized for transcription in the six target languages.
  2. Paza-MMS-1B-All: fine-tuned from Meta's mms-1b-all model, improving transcription accuracy while maintaining cross-lingual generalization.
  3. Paza-Whisper-Large-v3-Turbo: fine-tuned from OpenAI's whisper-large-v3-turbo base model, offering reliable ASR capabilities.

Microsoft intends to expand PazaBench beyond African languages and to evaluate state-of-the-art ASR models on more low-resource languages worldwide. The company is also developing practical guides to help the ecosystem curate datasets, optimize models, and evaluate them under real-world conditions.