The Russian language has an inflectional structure and no strict word order, which creates processing problems such as part-of-speech homonymy. This paper addresses that problem. Existing approaches to resolving morphological homonymy fall into four groups: rule-based, statistical, machine learning, and combined methods. We show that each approach has its advantages and disadvantages, but that combining several approaches yields considerably higher precision. The combined method based on neural networks gives better results than the others, reaching 98% precision. We used the following features: normalizing substitutions, grammatical and syntactic characteristics, vector representations of words, and word forms. All experiments were performed on the part of the National Corpus of the Russian Language in which homonymy has been resolved. Analysis of the corpus revealed that the most frequent types of homonymy occur between function words: particle vs. interjection (14%) and preposition vs. interjection (13.2%).
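To make the corpus statistic concrete, the following minimal sketch shows one way the share of each homonymy type could be computed from tokens annotated with all their candidate parts of speech. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `homonymy_type_shares`, the tag labels, and the toy token list are all hypothetical, and the shares are taken relative to ambiguous tokens only.

```python
from collections import Counter


def homonymy_type_shares(tokens):
    """Return the share of each POS-homonymy type among ambiguous tokens.

    tokens: iterable of (word_form, candidate_pos_tags) pairs, where
    candidate_pos_tags lists every part of speech an analyzer allows.
    """
    counts = Counter()
    for _, tags in tokens:
        unique = frozenset(tags)
        if len(unique) > 1:  # more than one candidate POS means homonymy
            counts[tuple(sorted(unique))] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


# Toy, hand-made data (hypothetical; not taken from the corpus).
toy_tokens = [
    ("а", ["PART", "INTJ"]),
    ("ну", ["PART", "INTJ"]),
    ("стекло", ["NOUN", "VERB"]),
    ("дом", ["NOUN"]),  # unambiguous, so it is ignored
]
shares = homonymy_type_shares(toy_tokens)
```

Sorting each tag set before counting means that "particle vs. interjection" and "interjection vs. particle" are treated as the same homonymy type.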