Durum wheat (DW), Triticum turgidum L. ssp. durum (Desf.) Husn., genome BBAA, is a cereal grain mainly used for pasta production and evolved from domesticated emmer wheat (DEW), T. turgidum ssp. dicoccum (Schrank ex Schübl.) Thell. DEW itself derived from wild emmer wheat (WEW), T. turgidum ssp. dicoccoides (Körn. ex Asch. & Graebn.
PURPOSE Recurrently mutated genes and chromosomal abnormalities have been identified in myelodysplastic syndromes (MDS). We aim to integrate these genomic features into disease classification and prognostication.
METHODS We retrospectively enrolled 2,043 patients. Using Bayesian networks and Dirichlet processes, we combined mutations in 47 genes with cytogenetic abnormalities to identify genetic associations and subgroups. Random-effects Cox proportional hazards multistate modeling was used for developing prognostic models. An independent validation on 318 cases was performed.
RESULTS We identify eight MDS groups (clusters) according to specific genomic features. In five groups, dominant genomic features include splicing gene mutations (SF3B1, SRSF2, and U2AF1) that occur early in disease history, determine specific phenotypes, and drive disease evolution. These groups display different prognosis (groups with SF3B1 mutations being associated with better survival). Specific co-mutation patterns account for clinical heterogeneity within SF3B1- and SRSF2-related MDS. MDS with complex karyotype and/or TP53 gene abnormalities and MDS with acute leukemia–like mutations show poorest prognosis. MDS with 5q deletion are clustered into two distinct groups according to the number of mutated genes and/or presence of TP53 mutations. By integrating 63 clinical and genomic variables, we define a novel prognostic model that generates personally tailored predictions of survival. The predicted and observed outcomes correlate well in internal cross-validation and in an independent external cohort. This model substantially improves predictive accuracy of currently available prognostic tools. We have created a Web portal that allows outcome predictions to be generated for user-defined constellations of genomic and clinical features.
CONCLUSION Genomic landscape in MDS reveals distinct subgroups associated with specific clinical features and discrete patterns of evolution, providing a proof of concept for next-generation disease classification and prognosis.
Background: The increasing availability of omics data, due to advancements both in the acquisition of molecular biology results and in systems biology simulation technologies, provides the basis for precision medicine. Success in precision medicine depends on access to healthcare and biomedical data. To this end, the digitization of all clinical exams and medical records is becoming standard in hospitals. Digitization is essential to collect, share, and aggregate large volumes of heterogeneous data to support the discovery of hidden patterns, with the aim of defining predictive models for biomedical purposes. Sharing patients' data is a critical process: it raises ethical, social, legal, and technological issues that must be properly addressed.
Results: In this work, we present an infrastructure devised to deal with the integration of large volumes of heterogeneous biological data. The infrastructure was applied to the data collected between 2010 and 2016 in one of the major diagnostic analysis laboratories in Italy. Data from three different platforms were collected (i.e., laboratory exams, pathological anatomy exams, biopsy exams). The infrastructure has been designed to allow the extraction and aggregation of both unstructured and semi-structured data. Data are properly treated to ensure data security and privacy. Specialized algorithms have also been implemented to process the aggregated information with the aim of obtaining a precise historical analysis of the clinical activities of one or more patients. Moreover, three Bayesian classifiers have been developed to analyze examinations reported as free text. Experimental results show that the classifiers exhibit good accuracy when used to analyze sentences related to the sample location, the presence of diseases, and the status of the illnesses.
Conclusions: The infrastructure allows the integration of multiple and heterogeneous sources of anonymized data from the different clinical platforms. Both unstructured and semi-structured data are processed to obtain a precise historical analysis of the clinical activities of one or more patients. Data aggregation makes it possible to perform a series of statistical assessments required to answer complex questions that can be used in a variety of fields, such as predictive and precision medicine. In particular, studying the clinical history of patients who have developed similar pathologies can help to predict or identify markers that allow an early diagnosis of possible illnesses.
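The abstract above mentions Bayesian classifiers for free-text examinations but gives no implementation details. As a rough illustration only, the following is a minimal sketch of one common variant, a multinomial naive Bayes classifier with add-one smoothing. The class name, the tokenizer, the labels, and the toy sentences are all assumptions invented for this example; they are not the authors' method or data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Hypothetical multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, sentences, labels):
        self.class_counts = Counter(labels)       # sentences per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        self.vocabulary = set()
        for sentence, label in zip(sentences, labels):
            tokens = tokenize(sentence)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, sentence):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(sentence):
                if token not in self.vocabulary:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy sentences and labels, not data from the study.
train_sentences = [
    "biopsy of the left colon",
    "sample taken from gastric mucosa",
    "no evidence of malignancy",
    "carcinoma present in the sample",
]
train_labels = ["location", "location", "disease", "disease"]

model = NaiveBayesTextClassifier().fit(train_sentences, train_labels)
prediction = model.predict("biopsy of the gastric mucosa")
```

A real clinical pipeline would need a proper tokenizer, far more training data, and held-out evaluation; the sketch only shows how word counts per class translate into a predicted label.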