Abstract. The introduction of the Semantic Web vision and the shift toward machine-understandable Web resources have highlighted the importance of automatic semantic reconciliation. Consequently, new tools for automating the process have been proposed. In this work we present a formal model of semantic reconciliation and systematically analyze the properties of the process outcome, primarily the inherent uncertainty of the matching process and how it is reflected in the resulting mappings. An important feature of this research is the identification and analysis of factors that affect the effectiveness of algorithms for automatic semantic reconciliation, leading, it is hoped, to the design of better algorithms by reducing the uncertainty of existing ones. Against this background we empirically study the ability of two algorithms to correctly match concepts. This research is both timely and practical in light of recent attempts to develop and utilize methods for automatic semantic reconciliation.
Based on recent advances in natural language modeling and text generation, we propose a novel data augmentation method for text classification tasks. We use a powerful pre-trained neural network model to artificially synthesize new labeled data for supervised learning. We mainly focus on cases with scarce labeled data. Our method, referred to as language-model-based data augmentation (LAMBADA), involves fine-tuning a state-of-the-art language generator to a specific task through an initial training phase on the existing (usually small) labeled data. Using the fine-tuned model and given a class label, new sentences for the class are generated. Our process then filters these new sentences by using a classifier trained on the original data. In a series of experiments, we show that LAMBADA improves classifiers' performance on a variety of datasets. Moreover, LAMBADA significantly improves upon the state-of-the-art techniques for data augmentation, specifically those applicable to text classification tasks with little data.
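The filtering step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the "classifier" is a toy word-count scorer rather than the model used by the authors, the generated candidates are a hard-coded list standing in for the output of a fine-tuned language generator, and the rule of keeping a candidate only when the predicted label matches the intended label with sufficient confidence (`min_conf`, `top_k`) is one plausible reading of "filters", not a detail given in the abstract.

```python
from collections import Counter, defaultdict


def train_word_counts(labeled):
    """Toy stand-in for a classifier: per-class word counts from (text, label) pairs."""
    counts = defaultdict(Counter)
    for text, label in labeled:
        counts[label].update(text.lower().split())
    return counts


def predict(counts, text):
    """Return (best_label, confidence) from add-one smoothed word-overlap scores."""
    words = text.lower().split()
    scores = {label: 1 + sum(c[w] for w in words) for label, c in counts.items()}
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())


def filter_generated(counts, candidates, min_conf=0.6, top_k=2):
    """Keep candidates the classifier assigns to the label they were generated for.

    min_conf and top_k are hypothetical knobs, not parameters from the source.
    """
    kept = defaultdict(list)
    for text, intended in candidates:
        label, conf = predict(counts, text)
        if label == intended and conf >= min_conf:
            kept[intended].append((conf, text))
    # Within each class, keep the top_k most confident sentences.
    return {
        label: [text for _, text in sorted(items, reverse=True)[:top_k]]
        for label, items in kept.items()
    }


# Small original labeled set (made-up example data).
original = [
    ("book a flight to boston", "travel"),
    ("cheap flight tickets", "travel"),
    ("check my account balance", "bank"),
    ("transfer money to savings account", "bank"),
]

# Hard-coded stand-ins for sentences a fine-tuned generator might produce per label.
candidates = [
    ("find a flight to paris", "travel"),
    ("open a savings account", "travel"),  # generated for the wrong class
    ("what is my balance", "bank"),
]

counts = train_word_counts(original)
result = filter_generated(counts, candidates)
```

In this toy run the second candidate is discarded because the scorer assigns it to "bank" rather than the "travel" label it was generated for; the surviving sentences would then be added to the training set.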
Decision makers often need to take into account multiple conflicting objectives when selecting a solution for their problem. This can result in a potentially large number of candidate solutions to be considered. Visualizing a Pareto Frontier, the optimal set of solutions to a multi-objective problem, is considered a difficult task when the problem at hand spans more than three objective functions. We introduce a novel visual-interactive approach to facilitate coping with multi-objective problems. We propose a characterization of the Pareto Frontier data and the tasks decision makers face as they reach their decisions. Following a comprehensive analysis of the design alternatives, we show how a semantically-enhanced Self-Organizing Map can be utilized to meet the identified tasks. We argue that our newly proposed design provides both a consistent orientation of the 2D mapping and an appropriate visual representation of individual solutions. We then demonstrate its applicability with two real-world multi-objective case studies. We conclude with a preliminary empirical evaluation and a qualitative usefulness assessment.
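As a reminder of what the Pareto Frontier contains, the sketch below filters a small set of candidate solutions down to the non-dominated ones. It assumes that every objective is to be minimized and uses made-up four-objective points; it does not reproduce the Self-Organizing Map design from the abstract, only the standard dominance test that defines the data being visualized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one.

    Assumes all objectives are minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates (simple O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidate solutions, each scored on four objectives.
solutions = [
    (1, 5, 3, 2),
    (2, 4, 3, 2),
    (2, 6, 4, 3),  # dominated by (1, 5, 3, 2)
    (3, 3, 5, 1),
]

front = pareto_front(solutions)
```

With four objectives, the three surviving solutions each trade off at least one objective against another, which is the kind of data that becomes hard to show in a conventional scatter plot.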
Before requirements analysis takes place in a business context, business analysis is usually performed. Important concerns emerge during this analysis that need to be captured and communicated to requirements engineers. In this paper, we take the position that tagging is a promising approach for identifying and organizing these concerns. The fact that tags can be attached freely to entities, often with multiple tags attached to the same entity and the same tag attached to multiple entities, leads to multi-dimensional structures that are suitable for representing crosscutting concerns and exploring their relationships. The resulting tag structures can be hardened into classifications that capture and communicate important concerns.
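The many-to-many relationship between tags and entities described above can be sketched as a pair of indexes. The class name `TagIndex`, its methods, and the example entities and tags are all hypothetical; the sketch only shows how free tagging yields a structure in which the entities shared by two concerns can be looked up.

```python
from collections import defaultdict


class TagIndex:
    """Two-way index: an entity can carry many tags, and a tag can mark many entities."""

    def __init__(self):
        self.tags_of = defaultdict(set)      # entity -> tags attached to it
        self.entities_of = defaultdict(set)  # tag -> entities it is attached to

    def tag(self, entity, *tags):
        """Attach one or more tags to an entity."""
        for t in tags:
            self.tags_of[entity].add(t)
            self.entities_of[t].add(entity)

    def shared_entities(self, tag_a, tag_b):
        """Entities carrying both tags, i.e. where two concerns cut across each other."""
        a = self.entities_of.get(tag_a, set())
        b = self.entities_of.get(tag_b, set())
        return sorted(a & b)


# Made-up business entities and concerns.
index = TagIndex()
index.tag("order process", "compliance", "performance")
index.tag("invoice report", "compliance", "security")
index.tag("customer portal", "security", "performance")

both = index.shared_entities("compliance", "security")
```

Once the tag vocabulary stabilizes, the keys of `entities_of` could serve as the starting point for a fixed classification.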