N-gram language models are popular and extensively used statistical methods for solving various natural language processing problems, including grammar checking. Smoothing is one of the most effective techniques used in building a language model to deal with the data sparsity problem, and Kneser-Ney is one of the most prominent and successful smoothing techniques for language modelling. In our previous work, we presented a Witten-Bell smoothing based language modelling technique for checking the grammatical correctness of Bangla sentences, which showed promising results and outperformed previous methods. In this work, we propose an improved method that uses a Kneser-Ney smoothed n-gram language model for grammar checking and perform a comparative performance analysis between the Kneser-Ney and Witten-Bell smoothing techniques for the same purpose. We also provide an improved technique for calculating the optimum threshold, which further enhances the results. Our experimental results show that Kneser-Ney outperforms Witten-Bell as a smoothing technique when used with n-gram language models for checking the grammatical correctness of Bangla sentences.
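The Kneser-Ney idea can be illustrated with a small sketch. The helper below is our own minimal example, not the code or exact formulation used in the paper: the function name, the fixed 0.75 discount, and the restriction to bigrams are illustrative assumptions. It builds an interpolated Kneser-Ney bigram model in which observed bigram counts are discounted by a fixed amount and the freed-up probability mass is redistributed according to how many distinct contexts a word appears in, rather than its raw frequency.

    from collections import Counter, defaultdict

    def build_kneser_ney_bigram(sentences, discount=0.75):
        """Collect the counts needed for an interpolated Kneser-Ney bigram model
        and return a probability function prob(prev, cur) ~ P(cur | prev)."""
        bigram_counts = Counter()
        history_counts = Counter()      # how often each word occurs as a bigram history
        followers = defaultdict(set)    # distinct words seen after each history
        preceders = defaultdict(set)    # distinct left contexts of each word
        for tokens in sentences:
            for prev, cur in zip(tokens, tokens[1:]):
                bigram_counts[(prev, cur)] += 1
                history_counts[prev] += 1
                followers[prev].add(cur)
                preceders[cur].add(prev)
        bigram_types = len(bigram_counts)

        def prob(prev, cur):
            # Continuation probability: in how many distinct contexts does cur appear?
            continuation = len(preceders[cur]) / bigram_types if bigram_types else 0.0
            hist = history_counts[prev]
            if hist == 0:               # unseen history: fall back to the continuation probability
                return continuation
            discounted = max(bigram_counts[(prev, cur)] - discount, 0.0) / hist
            backoff_weight = discount * len(followers[prev]) / hist
            return discounted + backoff_weight * continuation

        return prob

A grammar checker in the style described above would then score each n-gram of a candidate sentence with such a model and flag the sentence as ungrammatical when its score falls below the tuned threshold.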
The type of an entity is a key piece of information for understanding what an entity is and how it relates to other entities mentioned in a document. Search engine result pages (SERPs) often surface facts and entity type information from a background Knowledge Graph (KG) in response to queries that carry a semantic information need. In a KG, an entity usually holds multiple type properties. Given an entity in a KG, it is therefore important to rank the types attached to it by relevance to a particular user and information need, since the most popular type is not always the most informative within a textual context. In this paper, we address the entity type ranking problem by means of KG embedding models and show that entity type ranking can be seen as a special case of the KG completion problem. Embeddings can be learned from both the structural and the probabilistic information of the entities. We propose a Representation Learning model for Type Ranking (RL-TRank), in which the results of the structure embedding and the probabilistic embedding are combined to obtain the entity type ranking. Experimental results show that our RL-TRank approaches outperform state-of-the-art type ranking models while, at the same time, being more efficient and scalable.
Most online queries target entities, and the type of an entity is a key piece of information: it helps us understand what an entity is and how it relates to other entities mentioned in a document. Search engine result pages (SERPs) often surface facts and entity type information from a background Knowledge Graph (KG) in response to queries that carry a semantic information need. In a KG, an entity usually holds multiple type properties. For example, popular types attached to the entity 'Donald Trump' via rdfs:type statements might be Person, Businessman, and Leader, while other types of this entity, e.g., Solicitor, Restaurateur, and Writer, might also be interesting to some users. Unpopular entity types can be useful for tail queries such as 'Is Donald Trump an American Television Producer' or 'Is Donald Trump an American Television Actor'. Given an entity in a KG, it is therefore important to rank the types attached to it by relevance to a particular user and information need, since the most popular type is not always the most informative within a textual context. In this paper, we address the entity type ranking problem by means of KG embedding models and show that entity type ranking can be seen as a special case of the KG completion problem. Embeddings can be learned from the structural, probabilistic, and contextual description information of the entities. We propose and evaluate methods that find the most relevant entity types based on collection statistics and on the graph structure interconnecting entities and types, focusing on the task of ranking the set of types associated with an entity in a background knowledge graph so as to select the most relevant ones. Experimental results show that our proposed approaches outperform state-of-the-art type ranking models while, at the same time, being more efficient and scalable.
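To make the "type ranking as KG completion" view concrete, the sketch below scores candidate (entity, hasType, type) triples with a TransE-style distance and interpolates that structure signal with a simple frequency prior derived from collection statistics. The function name, the hasType relation, the alpha weight, and the choice of TransE are illustrative assumptions on our part, not the exact RL-TRank formulation.

    import numpy as np

    def rank_types(entity_vec, rel_vec, type_vecs, type_freq, alpha=0.5):
        """Rank candidate types for one entity by combining a TransE-style
        structure score for the triple (entity, hasType, type) with a
        frequency-based prior over types.

        entity_vec : (d,) embedding of the entity
        rel_vec    : (d,) embedding of the assumed 'hasType' relation
        type_vecs  : dict {type_name: (d,) embedding}
        type_freq  : dict {type_name: occurrence count in the KG}
        alpha      : interpolation weight between the two signals
        """
        total = sum(type_freq.values()) or 1
        scores = {}
        for t, t_vec in type_vecs.items():
            # TransE plausibility: a small ||e + r - t|| means a plausible triple.
            structure_score = -np.linalg.norm(entity_vec + rel_vec - t_vec)
            prior = type_freq.get(t, 0) / total          # collection-statistics prior
            scores[t] = alpha * structure_score + (1 - alpha) * np.log(prior + 1e-9)
        # Higher combined score = more relevant type for this entity.
        return sorted(scores, key=scores.get, reverse=True)

In practice the two signals would be normalised before interpolation, and the structure score could come from any embedding model trained on the KG triples.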