The emerging importance of embryonic development research rapidly increases the volume for a professional resource related to multi-omics data. However, the lack of global embryogenesis repository and systematic analysis tools limits the preceding in stem cell research, human congenital diseases and assisted reproduction. Here, we developed the EmAtlas, which collects the most comprehensive multi-omics data and provides multi-scale tools to explore spatiotemporal activation during mammalian embryogenesis. EmAtlas contains data on multiple types of gene expression, chromatin accessibility, DNA methylation, nucleosome occupancy, histone modifications, and transcription factors, which displays the complete spatiotemporal landscape in mouse and human across several time points, involving gametogenesis, preimplantation, even fetus and neonate, and each tissue involves various cell types. To characterize signatures involved in the tissue, cell, genome, gene and protein levels during mammalian embryogenesis, analysis tools on these five scales were developed. Additionally, we proposed EmRanger to deliver extensive development-related biological background annotations. Users can utilize these tools to analyze, browse, visualize, and download data owing to the user-friendly interface. EmAtlas is freely accessible at http://bioinfor.imu.edu.cn/ematlas.
Chronic diseases, because of insidious onset and long latent period, have become the major global disease burden. However, the current chronic disease diagnosis methods based on genetic markers or imaging analysis are challenging to promote completely due to high costs and cannot reach universality and popularization. This study analyzed massive data from routine blood and biochemical test of 32 448 patients and developed a novel framework for cost-effective chronic disease prediction with high accuracy (AUC 87.32%). Based on the best-performing XGBoost algorithm, 20 classification models were further constructed for 17 types of chronic diseases, including 9 types of cancers, 5 types of cardiovascular diseases and 3 types of mental illness. The highest accuracy of the model was 90.13% for cardia cancer, and the lowest was 76.38% for rectal cancer. The model interpretation with the SHAP algorithm showed that CREA, R-CV, GLU and NEUT% might be important indices to identify the most chronic diseases. PDW and R-CV are also discovered to be crucial indices in classifying the three types of chronic diseases (cardiovascular disease, cancer and mental illness). In addition, R-CV has a higher specificity for cancer, ALP for cardiovascular disease and GLU for mental illness. The association between chronic diseases was further revealed. At last, we build a user-friendly explainable machine-learning-based clinical decision support system (DisPioneer: http://bioinfor.imu.edu.cn/dispioneer) to assist in predicting, classifying and treating chronic diseases. This cost-effective work with simple blood tests will benefit more people and motivate clinical implementation and further investigation of chronic diseases prevention and surveillance program.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.