In knowledge discovery, the reliability of results depends on the effectiveness of the attributes selected for decision making. The curse of dimensionality refers to the phenomenon in which an excessive number of dimensions degrades the analysis. To overcome the curse of dimensionality in text analysis, we propose an ontology-based semantic measure for intelligent selection and reduction of features. Among the various text mining techniques, ontology-based mining has made a significant contribution to the field. Ontology-based semantic measures, which are mathematical models used to find the similarity between concepts in an ontology, have in turn made a significant contribution to feature engineering. The proposed measure is an amalgamation of semantic similarity, relatedness, and distance. It allows an in-depth analysis of the various semantic relationships between concepts of the English language. The performance of the measure was evaluated against benchmark dimension reduction techniques such as PCA. The results show an improvement, reducing the number of dimensions by up to 35%. The results were further validated by training a classifier to confirm that the selected features do not produce an underfit or overfit model. INDEX TERMS: Feature engineering, dimension reduction, semantic measures, ontology.
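The general idea of ontology-driven feature reduction can be illustrated with a minimal Python sketch. It is not the measure proposed in the abstract above, which combines similarity, relatedness, and distance in a way the abstract does not specify. The sketch assumes a hypothetical toy is-a hierarchy (`PARENT`), a simple path-based similarity, and an arbitrary threshold of 0.3; all names and values are illustrative assumptions.

```python
# Hypothetical toy is-a hierarchy (child -> parent); an assumption for
# illustration only, not an ontology used in the paper.
PARENT = {
    "car": "vehicle",
    "truck": "vehicle",
    "vehicle": "entity",
    "apple": "fruit",
    "banana": "fruit",
    "fruit": "entity",
}


def ancestors(term):
    """Return the chain from term up to the root, with term first."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def path_distance(a, b):
    """Count is-a edges between a and b through their lowest common ancestor."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    depth_in_b = {term: i for i, term in enumerate(chain_b)}
    for i, term in enumerate(chain_a):
        if term in depth_in_b:
            return i + depth_in_b[term]
    raise ValueError("terms share no common ancestor")


def path_similarity(a, b):
    """Simple path-based similarity in (0, 1]; identical terms score 1.0."""
    return 1.0 / (1.0 + path_distance(a, b))


def reduce_features(features, threshold=0.3):
    """Greedily keep a feature unless it is similar to one already kept.

    Returns the kept features and a mapping from every input feature to
    the kept feature that represents it.
    """
    kept, mapping = [], {}
    for feature in features:
        for representative in kept:
            if path_similarity(feature, representative) >= threshold:
                mapping[feature] = representative
                break
        else:
            kept.append(feature)
            mapping[feature] = feature
    return kept, mapping


kept, mapping = reduce_features(["car", "truck", "apple", "banana"])
print(kept, mapping)
```

With this toy hierarchy, sibling terms such as "car" and "truck" are two edges apart and are merged, while terms from different branches stay separate. A real pipeline would replace the hierarchy with an actual ontology and the threshold with a tuned value.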
Due to the proliferation of data-generating devices such as sensors in scientific applications, data integration has become a most challenging task, since the data stemming from these devices are extremely heterogeneous in terms of structure (schema) and semantics (interpretation). In practice, integration and transformation are typically performed manually by the scientists, which requires extensive effort. Approaches that automate the data integration task as far as possible are badly needed. DaltOn is a generic framework that offers various functionalities for managing data in scientific applications. In this paper, we present DaltOn's functionality for automating the data integration task by exploiting ontologies. In addition, we elaborate on the specific module of our framework that is responsible for implementing this functionality. Finally, we present the core algorithms and an evaluation of our approach.
Abstract. Scientific applications commonly require the integration of information coming from multiple sources. This usually confronts end users with data management issues that involve the transportation of data from one system to another as well as the syntactic and semantic integration of data, i.e., data come in different formats and have different meanings. In order to deal with these issues in a systematic and well-structured way, we propose a framework based on process modeling. In this paper, we present the three major conceptual architectural abstractions of the system and detail its execution.
Recently, AI-based methods have been frequently used in the healthcare industry to turn historical hindsight into insight and to envisage foresight. For example, identifying the epidemiological patterns of thyroid disease in targeted areas supports healthcare industry stakeholders (government agencies, health organizations, NGOs, policy makers, and so on) in formulating proper policies to combat such diseases. Predictive Future Visualization (FV) of the prevalence patterns of thyroid disease also helps these stakeholders focus on specific areas. This paper presents a system called TDV: Intelligent System for Thyroid Disease Visualization, which offers policy makers a potential surveillance pattern of thyroid disease for the next ten years (2013-2022), based on thyroid disease prevalence data from the past ten years (2002-2012). The methodology of our system comprises three main steps. In the first step, we apply data preprocessing techniques. In the second step, we construct the decision model using Time Series Regression (TSR) in R. In the third step, we visualize the results on a geographic map plotted in Q-GIS. Based on the results of our approach, we conclude that thyroid disease may increase by more than 15% over the next ten years in the 21-30 age group and that females are more prone to thyroid disease.
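The regression step described above can be sketched as a simple linear trend fitted by ordinary least squares and extrapolated forward. This is a Python analogue for illustration only: the authors report using R, and the abstract does not state the exact model form. The yearly case counts below are synthetic placeholders, not data from the paper, and the function names are hypothetical.

```python
def fit_linear_trend(years, values):
    """Fit values = intercept + slope * year by ordinary least squares."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(slope, intercept, years):
    """Extrapolate the fitted trend to the given years."""
    return [intercept + slope * year for year in years]


# Synthetic placeholder counts for the observation window 2002-2012;
# these numbers are invented for the example.
observed_years = list(range(2002, 2013))
observed_cases = [100 + 5 * i for i in range(len(observed_years))]

slope, intercept = fit_linear_trend(observed_years, observed_cases)

# Project the trend over the forecast window 2013-2022.
future_years = list(range(2013, 2023))
projected_cases = forecast(slope, intercept, future_years)

for year, cases in zip(future_years, projected_cases):
    print(year, round(cases, 1))
```

A plain linear trend is the simplest form of time series regression. A model for real prevalence data would likely need additional terms (for example, age group and gender as covariates), which this sketch leaves out.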