Background Complex disease classification is an important part of the complex disease diagnosis and personalized treatment process. It has been shown that the integration of multi-omics data can analyze and classify complex diseases more accurately, because multi-omics data are highly correlated with the onset and progression of various diseases and can provide comprehensive and complementary information about a disease. However, multi-omics data of complex diseases are usually characterized by high imbalance, scale variation, high data heterogeneity and high noise interference, which pose great challenges to multi-omics integration methods. Results We propose a novel multi-omics data integration learning model called MODILM, to obtain more important and complementary information for complex disease classification from multiple omics data. Specifically, MODILM first initially constructs a similarity network for each omics data using cosine similarity measure, then learns the sample-specific features and intra-association features of single-omics data from the similarity networks using Graph Attention Networks, then maps them uniformly to a new feature space to further strengthen and extract high-level omics-specific features of the omics data using Multilayer Perceptron networks. MODILM then uses a View Correlation Discovery Network to fuse the high-level omics-specific features extracted from each omics data and further learn cross-omics features in the label space, providing unique class-level distinctiveness to classify complex diseases. We conducted extensive experiments on six benchmark datasets having the miRNA expression data, mRNA and DNA methylation data to demonstrate the superiority of our MODILM. The experimental results show that MODILM outperforms state-of-the-art methods, effectively improving the accuracy of complex disease classification. Conclusions Our MODILM provides a more competitive way to extract and integrate important and complementary information from multiple omics data, providing a very promising tool for supporting decision making for clinical diagnosis.
Background Accurately classifying complex diseases is crucial for diagnosis and personalized treatment. Integrating multi-omics data has been demonstrated to enhance the accuracy of analyzing and classifying complex diseases. This can be attributed to the highly correlated nature of the data with various diseases, as well as the comprehensive and complementary information it provides. However, integrating multi-omics data for complex diseases is challenged by data characteristics such as high imbalance, scale variation, heterogeneity, and noise interference. These challenges further emphasize the importance of developing effective methods for multi-omics data integration. Results We proposed a novel multi-omics data learning model called MODILM, which integrates multiple omics data to improve the classification accuracy of complex diseases by obtaining more significant and complementary information from different single-omics data. Our approach includes four key steps: 1) constructing a similarity network for each omics data using the cosine similarity measure, 2) leveraging Graph Attention Networks to learn sample-specific and intra-association features from similarity networks for single-omics data, 3) using Multilayer Perceptron networks to map learned features to a new feature space, thereby strengthening and extracting high-level omics-specific features, and 4) fusing these high-level features using a View Correlation Discovery Network to learn cross-omics features in the label space, which results in unique class-level distinctiveness for complex diseases. To demonstrate the effectiveness of MODILM, we conducted experiments on six benchmark datasets consisting of miRNA expression, mRNA, and DNA methylation data. Our results show that MODILM outperforms state-of-the-art methods, effectively improving the accuracy of complex disease classification. Conclusions Our MODILM provides a more competitive way to extract and integrate important and complementary information from multiple omics data, providing a very promising tool for supporting decision-making for clinical diagnosis.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.