Quantifying cell proportions, especially those of rare cell types, is of great value in tracking signals associated with certain phenotypes or diseases. Although several methods have been proposed to infer cell proportions from multicomponent bulk data, they are substantially less effective at estimating the proportions of rare cell types, which are highly sensitive to feature outliers and collinearity. Here we propose a new deconvolution algorithm named ARIC to estimate cell type proportions from gene expression or DNA methylation data. ARIC employs a novel two-step marker selection strategy, comprising collinear feature elimination based on the component-wise condition number and adaptive removal of outlier markers. This strategy systematically selects effective markers for weighted $\nu$-support vector regression to ensure robust and precise prediction of rare proportions. We show that ARIC can accurately estimate fractions in both DNA methylation and gene expression data from different experiments. We further applied ARIC to survival prediction in ovarian cancer and condition monitoring in chronic kidney disease, and the results demonstrate its high accuracy, robustness and clinical potential. Taken together, ARIC is a promising tool for solving the deconvolution problem in bulk data where rare components are of vital importance.
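The deconvolution problem in this abstract can be illustrated with a small sketch. It assumes the usual linear mixing model, in which a bulk profile is a proportion-weighted sum of cell-type signature profiles. It is not ARIC: the marker selection steps are omitted, and plain projected-gradient least squares stands in for the weighted support vector regression. The function name `deconvolve`, the signature matrix and the proportions are all hypothetical values made up for illustration.

```python
def deconvolve(signature, bulk, n_iter=2000):
    """Estimate non-negative cell-type proportions that sum to 1.

    signature: list of rows, one per marker, one column per cell type.
    bulk:      list of bulk measurements, one per marker.
    Illustrative stand-in only; not the ARIC algorithm.
    """
    n_markers = len(signature)
    n_types = len(signature[0])
    # The squared Frobenius norm of S upper-bounds the largest eigenvalue
    # of S^T S, so 1 / ||S||_F^2 is a conservative, safe step size.
    step = 1.0 / sum(v * v for row in signature for v in row)
    x = [1.0 / n_types] * n_types  # start from equal proportions
    for _ in range(n_iter):
        residual = [
            sum(signature[i][k] * x[k] for k in range(n_types)) - bulk[i]
            for i in range(n_markers)
        ]
        grad = [
            sum(signature[i][k] * residual[i] for i in range(n_markers))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[k] - step * grad[k]) for k in range(n_types)]
    total = sum(x)
    return [v / total for v in x] if total > 0 else x


# Hypothetical signature: 4 markers (rows) x 3 cell types (columns).
signature = [
    [10.0, 1.0, 1.0],
    [1.0, 10.0, 1.0],
    [1.0, 1.0, 10.0],
    [8.0, 2.0, 1.0],
]
true_props = [0.70, 0.25, 0.05]  # the third cell type is the rare one
bulk = [sum(s * p for s, p in zip(row, true_props)) for row in signature]
estimated = deconvolve(signature, bulk)
```

In this noise-free toy case the rare 5% component is recovered because the signature columns are far from collinear. With nearly collinear columns or outlier markers the estimate of a small proportion becomes unstable, which is the failure mode the abstract says marker selection is meant to address.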
Detecting cancer signals in cell-free DNA (cfDNA) high-throughput sequencing data is emerging as a novel noninvasive cancer detection method. Due to the high cost of sequencing, it is crucial to make robust and precise predictions with low-depth cfDNA sequencing data. Here we propose a novel approach named DISMIR, which can provide ultrasensitive and robust cancer detection by integrating DNA sequence and methylation information in plasma cfDNA whole-genome bisulfite sequencing (WGBS) data. DISMIR introduces a new feature termed the ‘switching region’ to define cancer-specific differentially methylated regions, which can enrich the cancer-related signal at read resolution. DISMIR applies a deep learning model to predict the source of every single read based on its DNA sequence and methylation state and then predicts the risk that the plasma donor is suffering from cancer. DISMIR exhibited high accuracy and robustness in detecting hepatocellular carcinoma from plasma cfDNA WGBS data, even at ultralow sequencing depths. Further analysis showed that DISMIR tends to be insensitive to alterations in the methylation states of single CpG sites, which suggests that DISMIR is resistant to the technical noise of WGBS. Together, these results show that DISMIR has the potential to be a precise and robust method for low-cost early cancer detection.
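To make "based on its DNA sequence and methylation state" concrete, here is a minimal sketch of how a single read could be turned into a numeric input for a deep learning model. The encoding is an assumption chosen for illustration (four one-hot base channels plus one methylation channel) and is not confirmed by the abstract as DISMIR's actual input format. The function name `encode_read` and its arguments are hypothetical, and `sequence` is assumed to be the reference sequence the read maps to.

```python
BASES = "ACGT"


def encode_read(sequence, methylated_positions):
    """Encode one read as a list of 5-element rows, one row per position.

    Channels 0-3: one-hot base (A, C, G, T); unknown bases give all zeros.
    Channel 4:    1 if the position is a methylated cytosine, else 0.
    Hypothetical encoding for illustration only.
    """
    encoded = []
    for pos, base in enumerate(sequence.upper()):
        row = [1 if base == b else 0 for b in BASES]
        row.append(1 if pos in methylated_positions else 0)
        encoded.append(row)
    return encoded


# A 4-base read whose cytosine at index 1 (part of a CpG) is methylated.
example = encode_read("ACGT", {1})
```

Keeping sequence and methylation in the same per-position matrix lets a model learn joint patterns across neighbouring CpG sites instead of relying on any single site.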
Hematopoietic stem cells (HSCs) build up the blood system throughout the lifespan. N6-methyladenosine (m6A), the most prevalent RNA modification, modulates gene expression via the processes of “writing” and “reading”. Recent studies showed that the m6A “writer” genes Mettl3 and Mettl14 play an essential role in HSCs. However, which reader deciphers the m6A modification to modulate HSCs remains unknown. In this study, we observed that dysfunction of Ythdf3 and Ccnd1 severely impaired the reconstitution capacity of HSCs, which phenocopies Mettl3-deficient HSCs. Dysfunction of Ythdf3 and Mettl3 results in a translational defect of Ccnd1. Ythdf3 and Mettl3 regulate HSCs by transmitting m6A RNA methylation on the 5′ UTR of Ccnd1. Enforced expression of Ccnd1 completely rescues the defect of Ythdf3-/- HSCs and partially rescues Mettl3-compromised HSCs. Taken together, this study identifies, for the first time, Ccnd1 as the target through which METTL3 and YTHDF3 transmit the m6A RNA methylation signal to regulate HSC reconstitution capacity.
Motivation: Cell-free DNA (cfDNA) is gaining substantial attention from both biological and clinical fields as a promising marker for liquid biopsy. Many disease-related features have been discovered from cfDNA high-throughput sequencing (HTS) data. However, there is still a lack of integrative and systematic tools for cfDNA HTS data analysis and quality control (QC). Results: Here, we propose cfDNApipe, an easy-to-use and systematic Python package for cfDNA whole-genome sequencing (WGS) and whole-genome bisulfite sequencing (WGBS) data analysis. It covers the entire analysis pipeline for cfDNA data, including raw sequencing data processing, QC and sophisticated statistical analysis such as detecting copy number variations (CNVs), differentially methylated regions (DMRs) and DNA fragment size alterations. cfDNApipe provides one-command-line-execution pipelines and flexible application programming interfaces for customized analysis. Availability: https://xwanglabthu.github.io/cfDNApipe/ Supplementary information: Supplementary data are available at Bioinformatics online.