Author(s): Felermino Ali, Gabriel de Jesus, Henrique Cardoso, Rui Sousa-Silva, Sérgio Nunes.
Published in the 16th International Conference on Computational Processing of the Portuguese Language (PROPOR 2024), Santiago de Compostela, Galicia, Spain, 12–15 March 2024.
This paper introduces a network-based approach for automatic stopword detection in low-resource languages, tested on Tetun and Emakhuwa. By leveraging co-occurrence network properties, the method outperforms traditional frequency-based techniques, offering a scalable solution for NLP in under-resourced linguistic contexts.
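The summary above does not say which network properties the paper uses, so the following is only a rough illustration of the general idea, not the authors' method. It assumes a sliding-window co-occurrence graph and ranks words by node degree (number of distinct neighbours); the function names, the `window` and `top_k` parameters, and the English toy corpus are all hypothetical.

```python
from collections import defaultdict


def cooccurrence_graph(sentences, window=2):
    """Build an undirected co-occurrence graph as a dict: word -> set of neighbours."""
    graph = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            # Link each token to the tokens that follow it within the window.
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    graph[word].add(other)
                    graph[other].add(word)
    return graph


def stopword_candidates(sentences, top_k=3, window=2):
    """Rank words by degree (assumed measure) and return the top_k as candidates."""
    graph = cooccurrence_graph(sentences, window)
    # Sort by descending degree, breaking ties alphabetically for determinism.
    ranked = sorted(graph, key=lambda w: (-len(graph[w]), w))
    return ranked[:top_k]


# Toy English corpus for illustration only; not data from the paper.
corpus = [
    "the cat sat on the mat",
    "the dog ate the bone",
    "a bird sang in the tree",
]
print(stopword_candidates(corpus, top_k=1))
```

In this toy corpus the word "the" co-occurs with far more distinct words than any other token, so it ranks first. A frequency-based baseline would count occurrences instead of neighbours, which is the kind of technique the paper compares against.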