In recent years, high-throughput sequencing technologies have provided unprecedented opportunities to characterize cancer samples at multiple molecular levels. The integration and analysis of these multi-omics datasets is a critical step toward gaining actionable knowledge in a precision medicine framework. This paper explores recent data-driven methodologies that have been developed and applied to address major challenges of stratified medicine in oncology, including patient phenotyping, biomarker discovery, and drug repurposing. We systematically retrieved peer-reviewed journal articles published from 2014 to 2019, then selected and thoroughly described the tools presenting the most promising innovations regarding the integration of heterogeneous data, the machine learning methodologies that successfully tackled the complexity of multi-omics data, and the frameworks to deliver actionable results for clinical practice. The review is organized according to the applied methods: deep learning, network-based methods, clustering, feature extraction and transformation, and factorization. We provide an overview of the tools available in each methodological group and underline the relationships among the different categories. Our analysis revealed how multi-omics datasets could be exploited to drive precision oncology, but also exposed current limitations in the development of multi-omics data integration.
Variant interpretation for the diagnosis of genetic diseases is a complex process. The American College of Medical Genetics and Genomics, together with the Association for Molecular Pathology, has proposed a set of evidence-based guidelines to support variant pathogenicity assessment and reporting in Mendelian diseases. Cardiovascular disorders are one field of application of these guidelines, but practical implementation is challenging because of the genetic heterogeneity of these diseases and the complexity of the information sources that need to be integrated. Decision support systems that can automate variant interpretation for specific disease domains are therefore needed. We implemented CardioVAI (Cardio Variant Interpreter), an automated system for guidelines-based variant classification in cardiovascular-related genes. Different omics resources were integrated to assess the pathogenicity of every genomic variant in 72 genes related to cardiovascular diseases. We validated our method on benchmark datasets of variants assessed with high confidence, reaching pathogenicity and benignity concordance of up to 83% and 97.08%, respectively. We compared CardioVAI to similar methods and analyzed the main differences in terms of guidelines implementation. Finally, we made CardioVAI available as a web resource (http://cardiovai.engenome.com/) that allows users to further specialize the guidelines' recommendations.
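To make the idea of guidelines-based classification concrete, the following is a minimal sketch of how triggered ACMG/AMP evidence criteria might be combined into one of the five classes. It is a simplified reading of the published 2015 ACMG/AMP combining rules, not CardioVAI's implementation. The function name `classify_acmg` and the treatment of any conflict between pathogenic and benign evidence as uncertain are assumptions made for illustration.

```python
from collections import Counter

# Evidence-strength prefixes of ACMG/AMP criterion codes (e.g. "PVS1", "PM2", "BP4").
# "PVS" is listed before "PS" only for readability; the prefixes do not overlap.
PREFIXES = ("PVS", "PS", "PM", "PP", "BA", "BS", "BP")


def classify_acmg(criteria):
    """Combine triggered criterion codes into a five-tier class (illustrative sketch)."""
    n = Counter()
    for code in criteria:
        for prefix in PREFIXES:
            if code.startswith(prefix):
                n[prefix] += 1
                break
        else:
            raise ValueError(f"unrecognized criterion code: {code}")

    pvs, ps, pm, pp = n["PVS"], n["PS"], n["PM"], n["PP"]
    ba, bs, bp = n["BA"], n["BS"], n["BP"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    p_side = "Pathogenic" if pathogenic else "Likely pathogenic" if likely_pathogenic else None
    b_side = "Benign" if benign else "Likely benign" if likely_benign else None

    # Assumption: conflicting or insufficient evidence is reported as uncertain.
    if p_side and b_side:
        return "Uncertain significance"
    return p_side or b_side or "Uncertain significance"


print(classify_acmg(["PVS1", "PS3"]))
print(classify_acmg(["PM2", "PP3"]))
```

A real system would also need to decide which criteria are triggered for each variant from the integrated annotation sources, which is the harder part and is not shown here.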
Genomic variant interpretation is a critical step of the diagnostic procedure, often supported by tools that predict the damaging impact of each variant or provide a guidelines-based classification. We propose the application of machine learning methodologies, in particular penalized logistic regression, to support variant classification and prioritization. Our approach combines the ACMG/AMP guidelines for germline variant interpretation with variant annotation features and provides a probabilistic score of pathogenicity, thus supporting the prioritization and classification of variants that would be interpreted as uncertain under the ACMG/AMP guidelines alone. We compared different approaches in terms of variant prioritization and classification on different datasets, showing that our data-driven approach is able to resolve more variants of uncertain significance (VUS) than guidelines-based approaches and in silico prediction tools.
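As an illustration of the general technique, the sketch below fits an L2-penalized logistic regression by plain gradient descent and uses the resulting probability to rank variants. It is not the authors' model: the feature layout (three binary indicators for triggered criteria plus one scaled in silico score), the toy data, and the hyperparameters are all hypothetical, and the abstract does not state which penalty was used.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Probability of pathogenicity for one feature vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_penalized_logreg(X, y, l2=0.1, lr=0.5, epochs=2000):
    """Minimize mean log-loss + (l2 / 2) * ||w||^2 by batch gradient descent.

    The intercept b is left unpenalized, as is conventional.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data. Columns: [PM2 triggered, PP3 triggered, BP4 triggered,
# scaled in silico score]. Labels: 1 = pathogenic, 0 = benign.
X = [
    [1, 1, 0, 0.9], [1, 0, 0, 0.8], [0, 1, 0, 0.7], [1, 1, 0, 0.6],
    [0, 0, 1, 0.1], [0, 0, 1, 0.2], [0, 0, 0, 0.3], [0, 0, 1, 0.4],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = fit_penalized_logreg(X, y)

# Rank unseen variants by predicted probability, highest first.
candidates = {"var_A": [1, 1, 0, 0.85], "var_B": [0, 0, 0, 0.5], "var_C": [0, 0, 1, 0.15]}
ranking = sorted(candidates, key=lambda k: predict_proba(w, b, candidates[k]), reverse=True)
print(ranking)
```

The continuous score is what makes prioritization possible: a variant that a rule-based scheme would leave as uncertain still receives a probability that places it relative to other candidates.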