Named Entity Recognition System for Igala Language: Design, Implementation and Evaluation

Sani Felix Ayegba *

Department of Computer Science, Salem University, Lokoja, Nigeria.

*Author to whom correspondence should be addressed.


Abstract

The development of natural language processing resources for low-resource languages remains a critical challenge in computational linguistics, particularly for languages with millions of speakers but limited digital representation. This study presents the design, implementation, and evaluation of the first dedicated Named Entity Recognition (NER) system for the Igala language, a Yoruboid language spoken by approximately two million people in Kogi State, Nigeria. The research addresses this resource gap by creating a manually annotated Igala NER corpus of 35,000 sentences and 425,000 tokens, annotated for four entity types: person names, locations, organisations, and date expressions. Three NER architectures were implemented and evaluated: a Conditional Random Field (CRF) baseline model, a Bidirectional Long Short-Term Memory with Conditional Random Field (BiLSTM-CRF) neural model, and a fine-tuned African-focused transformer model (AfroXLMR-base). The BiLSTM-CRF model achieved the highest overall performance with an F1-score of 86.4%, followed by AfroXLMR-base with 84.7% and CRF with 79.2%. Significance testing indicated that the differences between the models are statistically significant (p < 0.05). Error analysis revealed that person names were recognised most accurately, at 89.1%, while organisations proved most challenging, at 81.3%, owing to morphological complexity and limited contextual patterns. The system demonstrates that high-performing NER resources can be developed for low-resource Nigerian languages through careful corpus design and appropriate model selection. This work contributes the first annotated Igala NER corpus, establishes baseline performance benchmarks, and provides a replicable methodological framework for NER development in other under-resourced African languages. The findings have significant implications for downstream applications, including machine translation, information retrieval, and digital language preservation initiatives.
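
For readers unfamiliar with how NER F1-scores such as those reported above are commonly computed, the following is a minimal sketch of entity-level (exact span match) precision, recall, and F1. It assumes a BIO tagging scheme and the hypothetical tag names PER, LOC, ORG, and DATE; the abstract specifies neither, so this illustrates the general metric and is not the study's own evaluation script.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); hypothetical representation
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sentence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((label, start, i))
            if tag.startswith(("B-", "I-")):
                # A stray I- tag is treated leniently as the start of a span.
                start, label = i, tag[2:]
            else:
                start, label = None, None
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over exactly matching spans."""
    tp = n_gold = n_pred = 0
    for gold_tags, pred_tags in zip(gold, pred):
        g, p = extract_spans(gold_tags), extract_spans(pred_tags)
        tp += len(g & p)  # a span counts only if type and boundaries both match
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy tag sequences (assumed labels, not drawn from the Igala corpus).
gold = [["B-PER", "I-PER", "O", "B-LOC"],
        ["B-ORG", "I-ORG", "I-ORG", "O", "B-DATE"]]
pred = [["B-PER", "I-PER", "O", "B-LOC"],
        ["B-ORG", "I-ORG", "O", "O", "B-DATE"]]  # ORG boundary is wrong

precision, recall, f1 = entity_f1(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this toy case the truncated organisation span counts as both a missed entity and a spurious one, which is one reason multi-token entity types can score lower under exact-match evaluation.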

Keywords: Igala language, named entity recognition, low-resource NLP, African languages, deep learning


How to Cite

Ayegba, Sani Felix. 2026. “Named Entity Recognition System for Igala Language: Design, Implementation and Evaluation”. Journal of Advances in Mathematics and Computer Science 41 (5):168-82. https://doi.org/10.9734/jamcs/2026/v41i52144.
