Automatically recognizing an existing semantic relation (such as "is a", "part of", "property of", "opposite of", etc.) between two arbitrary words (phrases, concepts, etc.) is an important task affecting many information retrieval and artificial intelligence applications, including query expansion, common-sense reasoning, question answering, and database federation. Currently, two classes of approaches exist to classify a relation between words (concepts) X and Y: (1) path-based and (2) distributional. While path-based approaches look at word paths connecting X and Y in text, distributional approaches look at statistical properties of X and Y separately, not necessarily in proximity to each other. Here, we suggest how both types can be improved and empirically compare them using several standard benchmark datasets. For our distributional approach, we suggest using an attention-based transformer. While transformers are known to be capable of supporting knowledge transfer between different tasks, and have recently set a number of benchmarking records in various applications, we are the first to successfully apply them to the task of recognizing semantic relations. To improve the path-based approach, we suggest an original neural word-path model that combines useful properties of convolutional and recurrent networks, thus addressing several shortcomings of prior path-based models. Both our models significantly outperform the state of the art within their respective types. Our transformer-based approach outperforms the current state of the art by 1-12 percentage points on 4 out of 6 standard benchmark datasets. This amounts to a 15-40% error reduction and closes the gap between automated and human performance by up to 50%. It also needs much less training data than prior approaches. For ease of reproducing our results, we make our source code and trained models publicly available.
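The distributional side of this approach can be illustrated with a short sketch: the concept pair is encoded as a single sequence by a pre-trained transformer, and a small classification head predicts the relation label from the pooled output. This is a minimal sketch under stated assumptions, not the authors' exact configuration; the model name (bert-base-uncased), the label set, and the untrained linear head are all illustrative.

```python
# Minimal sketch: transformer-based distributional relation classifier.
# The model, label set, and head are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

RELATIONS = ["is a", "part of", "property of", "opposite of"]  # hypothetical label set

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
head = torch.nn.Linear(encoder.config.hidden_size, len(RELATIONS))  # would be fine-tuned

def classify_pair(x: str, y: str) -> str:
    # Encode the pair as one sequence: [CLS] x [SEP] y [SEP]
    inputs = tokenizer(x, y, return_tensors="pt")
    with torch.no_grad():
        cls_vec = encoder(**inputs).last_hidden_state[:, 0]  # [CLS] representation
    return RELATIONS[int(head(cls_vec).argmax(dim=-1))]

print(classify_pair("green", "colour"))  # meaningless until the head is trained
```

In practice the encoder and head would be fine-tuned jointly on labelled concept pairs; the sketch only shows the inference path.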
Several applications dealing with natural language text involve automated validation of membership in a given category (e.g., France is a country; Gladiator is a movie, but not a country). Meta-learning is a recent and powerful machine learning approach whose goal is to train a model (or a family of models) on a variety of learning tasks so that it can solve new learning tasks more efficiently, e.g., using a smaller number of training samples or less time. We present an original approach inspired by meta-learning and consisting of two tiers of models: for any arbitrary category, our general model supplies high-confidence training instances (seeds) for our category-specific models. Our general model is based on pattern matching and optimized for precision at top N, while its recall is not important. Our category-specific models are based on recurrent neural networks (RNNs), which have recently proven extremely effective in several natural language applications, such as machine translation, sentiment analysis, parsing, and chatbots. Following meta-learning principles, we train our highest-level (general) model in such a way that our second-tier category-specific models (which depend on it) are optimized for the best possible performance in a specific application. This work is important because our approach is capable of verifying membership in an arbitrary category defined by a sequence of words, including longer and more complex categories such as Ridley Scott movie or City in southern Germany that are currently not supported by existing manually created ontologies (such as Freebase, WordNet, or Wikidata). Also, our approach uses only raw text, and thus can be useful when no such ontologies are available, which is a common situation for languages other than English. Even the largest English ontologies are known to have low coverage, insufficient for many practical applications such as automated question answering, which we use here to illustrate the advantages of our approach. We rigorously test it on a larger set of questions than previous studies and demonstrate that, when coupled with a simple answer-scoring mechanism, our meta-learning-inspired approach 1) provides up to 50% improvement over prior approaches that do not use any manually curated knowledge bases and 2) achieves state-of-the-art performance among all current approaches, including those taking advantage of such knowledge bases.
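As a rough illustration of the first (general) tier, the sketch below harvests high-confidence seed instances of a category from raw text using Hearst-style patterns and keeps only the most frequently matched candidates, favouring precision at top N over recall. The patterns, the frequency threshold, and the helper name seed_candidates are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of the precision-oriented first tier: harvest seed instances
# of a category via Hearst-style patterns over raw sentences. Patterns and
# the threshold are illustrative assumptions.
import re
from collections import Counter

def seed_candidates(category, corpus, top_n=20, min_count=2):
    # Two example patterns: "<Instance>, a <category>" and
    # "<category>s such as <Instance>"; capitalised multi-word instances only.
    patterns = [
        re.compile(rf"\b([A-Z][\w.-]+(?: [A-Z][\w.-]+)*)\s*,\s*a {re.escape(category)}\b"),
        re.compile(rf"\b{re.escape(category)}s? such as ([A-Z][\w.-]+(?: [A-Z][\w.-]+)*)"),
    ]
    counts = Counter()
    for sentence in corpus:
        for pat in patterns:
            counts.update(pat.findall(sentence))
    # Optimise for precision at top N: keep only repeatedly matched candidates.
    return [cand for cand, n in counts.most_common(top_n) if n >= min_count]

corpus = [
    "Gladiator, a movie directed by Ridley Scott, won five Oscars.",
    "Critics praised movies such as Gladiator for their scope.",
]
print(seed_candidates("movie", corpus))  # ['Gladiator']
```

These seeds would then serve as positive training examples for the second-tier, category-specific RNN models described above.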
Transformer-based pre-trained Language Models (PLMs) have emerged as the foundation of the current state-of-the-art algorithms in most natural language processing tasks, in particular when applied to context-rich data such as sentences or paragraphs. However, their impact on tasks defined in terms of abstract individual word properties, not necessarily tied to a specific use in a particular sentence, has been inadequately explored, which is a notable research gap. To fill this void, we concentrate on the classification of semantic relations: given a pair of concepts (words or word sequences), the aim is to identify the semantic label that describes their relationship. For example, for the pair green/colour, "is a" is a suitable relation, while "part of", "property of", and "opposite of" are not. This classification is independent of any particular sentence in which these concepts might have been used. We are the first to incorporate a language model into both existing approaches to this task, namely path-based and distribution-based methods. Our transformer-based approaches exhibit significant improvements over the state of the art and come remarkably close to human-level performance on rigorous benchmarks. We are also the first to provide evidence that the standard datasets overstate performance due to the effect of "lexical memorisation." We reduce this effect by applying lexical separation. On the new benchmark datasets, algorithmic performance remains significantly below human level, highlighting that the task of semantic relation classification is still unresolved, particularly for language models of the sizes commonly used at the time of our study. We also identify additional challenges that PLM-based approaches face and conduct extensive ablation studies and other experiments to investigate the sensitivity of our findings to specific modelling and implementation choices. Furthermore, we examine the specific relations that pose greater challenges and discuss the trade-offs between accuracy and processing time.
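The lexical separation mentioned above can be sketched as follows: partition the vocabulary first, then keep only the pairs whose two concepts land on the same side of the split, so no word occurs in both train and test and a model cannot score well by memorising individual words. The greedy vocabulary split below is an illustrative simplification; the paper's exact splitting procedure may differ.

```python
# Minimal sketch of lexical separation for relation datasets: split the
# vocabulary, then keep only pairs fully inside one side. Pairs straddling
# the split are discarded. Illustrative simplification.
import random

def lexical_split(pairs, test_fraction=0.3, seed=0):
    """pairs: list of ((x, y), relation) examples."""
    rng = random.Random(seed)
    vocab = sorted({w for (x, y), _ in pairs for w in (x, y)})
    rng.shuffle(vocab)
    test_vocab = set(vocab[:int(len(vocab) * test_fraction)])
    train = [p for p in pairs if not ({p[0][0], p[0][1]} & test_vocab)]
    test = [p for p in pairs if {p[0][0], p[0][1]} <= test_vocab]
    return train, test

data = [(("green", "colour"), "is a"), (("wheel", "car"), "part of"),
        (("green", "red"), "opposite of"), (("paris", "city"), "is a")]
train, test = lexical_split(data, test_fraction=0.5)
# By construction, train and test now share no concept.
```

Discarding straddling pairs shrinks the dataset, which is part of the accuracy/coverage trade-off such lexically separated benchmarks entail.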