Background: Methods for the integrative analysis of multi-omics data are required to draw a more complete and accurate picture of the dynamics of molecular systems. The complexity of biological systems, the technological limits, the large number of biological variables and the relatively low number of biological samples make the analysis of multi-omics datasets a non-trivial problem. Results and Conclusions: We review the most advanced strategies for integrating multi-omics datasets, focusing on mathematical and methodological aspects.
PURPOSE Recurrently mutated genes and chromosomal abnormalities have been identified in myelodysplastic syndromes (MDS). We aim to integrate these genomic features into disease classification and prognostication. METHODS We retrospectively enrolled 2,043 patients. Using Bayesian networks and Dirichlet processes, we combined mutations in 47 genes with cytogenetic abnormalities to identify genetic associations and subgroups. Random-effects Cox proportional hazards multistate modeling was used for developing prognostic models. An independent validation on 318 cases was performed. RESULTS We identify eight MDS groups (clusters) according to specific genomic features. In five groups, dominant genomic features include splicing gene mutations (SF3B1, SRSF2, and U2AF1) that occur early in disease history, determine specific phenotypes, and drive disease evolution. These groups display different prognoses (groups with SF3B1 mutations being associated with better survival). Specific co-mutation patterns account for clinical heterogeneity within SF3B1- and SRSF2-related MDS. MDS with complex karyotype and/or TP53 gene abnormalities and MDS with acute leukemia–like mutations show the poorest prognosis. MDS with 5q deletion are clustered into two distinct groups according to the number of mutated genes and/or presence of TP53 mutations. By integrating 63 clinical and genomic variables, we define a novel prognostic model that generates personally tailored predictions of survival. The predicted and observed outcomes correlate well in internal cross-validation and in an independent external cohort. This model substantially improves on the predictive accuracy of currently available prognostic tools. We have created a Web portal that allows outcome predictions to be generated for user-defined constellations of genomic and clinical features.
CONCLUSION The genomic landscape in MDS reveals distinct subgroups associated with specific clinical features and discrete patterns of evolution, providing a proof of concept for next-generation disease classification and prognosis.
Background: Breast cancer is one of the most common cancer types. Due to the complexity of this disease, it is important to approach its study in an integrated and multilevel way, from genes, transcripts and proteins to molecular networks, cell populations and tissues. According to the systems biology perspective, biological functions arise from complex networks: in this context, concepts like molecular pathways, protein-protein interactions (PPIs), mathematical models and ontologies play an important role in dissecting such complexity. Results: In this work we present the Genes-to-Systems Breast Cancer (G2SBC) Database, a resource which integrates data about genes, transcripts and proteins reported in the literature as altered in breast cancer cells. Besides data integration, we provide an ontology-based query system and analysis tools related to intracellular pathways, PPIs, protein structure and systems modelling, in order to facilitate the study of breast cancer from a multilevel perspective. The resource is available at the URL http://www.itb.cnr.it/breastcancer. Conclusions: The G2SBC Database represents a systems biology-oriented data integration approach devoted to breast cancer. By means of the analysis capabilities provided by the web interface, it is possible to overcome the limits of reductionist resources, enabling predictions that can lead to new experiments.