The state and development of the intestinal epithelium are vital for infant health, yet understanding in this area has been limited by an inability to directly assess epithelial cell biology in the healthy newborn intestine. To that end, we have developed a novel, noninvasive, molecular approach that applies next-generation RNA sequencing to stool samples containing intact epithelial cells in order to quantify intestinal gene expression. We then applied this technique to compare host gene expression in healthy term and extremely preterm infants. Bioinformatic analyses demonstrate repeatable detection of human mRNA expression, and network analysis shows immune cell function and inflammation pathways to be up-regulated in preterm infants. This study provides incontrovertible evidence that whole-genome sequencing of stool-derived RNA can be used to examine the neonatal host epithelial transcriptome, which opens up opportunities for sequential monitoring of gut gene expression in response to dietary or therapeutic interventions.
Background: Sequencing datasets consist of a finite number of reads which map to specific regions of a reference genome. Most effort in modeling these datasets focuses on the detection of univariate differentially expressed genes. However, for classification, we must consider multiple genes and their interactions.
Results: We introduce a hierarchical multivariate Poisson model (MP) and the associated optimal Bayesian classifier (OBC) for classifying samples using sequencing data. Lacking closed-form solutions, we employ a Markov chain Monte Carlo (MCMC) approach to perform classification. We demonstrate superior or equivalent classification performance compared to typical classifiers for two synthetic datasets and over a range of classification problem difficulties. We also introduce the Bayesian minimum mean squared error (MMSE) conditional error estimator and demonstrate its computation over the feature space. In addition, we demonstrate superior or leading class performance over an RNA-Seq dataset containing two lung cancer tumor types from The Cancer Genome Atlas (TCGA).
Conclusions: Through model-based, optimal Bayesian classification, we demonstrate superior classification performance for both synthetic and real RNA-Seq datasets. A tutorial video and Python source code are available under an open source license at http://bit.ly/1gimnss.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-014-0401-3) contains supplementary material, which is available to authorized users.
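The idea of classifying count data by comparing posterior predictive densities can be illustrated with a much simpler model than the one described in the abstract. The sketch below is not the authors' hierarchical multivariate Poisson model and does not use MCMC: it assumes independent genes, each with a Poisson rate under a conjugate Gamma prior, so the posterior predictive is a negative binomial in closed form. It therefore ignores the gene interactions that the abstract emphasizes. All function names, the prior parameters `alpha0` and `beta0`, and the toy counts are hypothetical.

```python
import math


def log_predictive(x, alpha, beta):
    """Log posterior predictive of a Poisson count x when the rate has a
    Gamma(alpha, rate=beta) posterior (a negative binomial distribution)."""
    return (math.lgamma(x + alpha) - math.lgamma(alpha) - math.lgamma(x + 1)
            + alpha * math.log(beta / (beta + 1.0))
            - x * math.log(beta + 1.0))


def fit_class(samples, alpha0=1.0, beta0=1.0):
    """Conjugate update per gene: Gamma(alpha0 + sum of counts, beta0 + n).
    alpha0 and beta0 are illustrative prior values, not taken from the paper."""
    n = len(samples)
    n_genes = len(samples[0])
    return [(alpha0 + sum(s[g] for s in samples), beta0 + n)
            for g in range(n_genes)]


def classify(x, posteriors, class_priors):
    """Return the index of the class with the largest log prior plus
    log posterior predictive, summed over (assumed independent) genes."""
    scores = []
    for post, prior in zip(posteriors, class_priors):
        score = math.log(prior)
        score += sum(log_predictive(xg, a, b) for xg, (a, b) in zip(x, post))
        scores.append(score)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy read counts for two genes in two classes (made-up numbers).
class0 = [[2, 10], [3, 12], [1, 9]]
class1 = [[10, 2], [12, 3], [9, 1]]
posteriors = [fit_class(class0), fit_class(class1)]
class_priors = [0.5, 0.5]

label_a = classify([2, 11], posteriors, class_priors)  # resembles class 0
label_b = classify([11, 2], posteriors, class_priors)  # resembles class 1
print(label_a, label_b)
```

When the model has no conjugate form, as the abstract states for the multivariate Poisson case, the same predictive densities would have to be approximated by sampling rather than evaluated analytically.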
There is mounting evidence that noncoding microRNAs (miRNAs) are modulated by select chemoprotective dietary agents. For example, we recently demonstrated that the unique combination of dietary fish oil (containing n-3 fatty acids) plus pectin (fermented to butyrate in the colon) (FPA) up-regulates a subset of putative tumor suppressor miRNAs in intestinal mucosa, and down-regulates their predicted target genes following carcinogen exposure, as compared to a control diet of corn oil plus cellulose (CCA). To further elucidate the biological effects of diet- and carcinogen-modulated miRNAs in the colon, we verified that miR-26b and miR-203 directly target PDE4B and TCF4, respectively. Perturbations in adult stem cell dynamics are generally believed to represent an early step in colon tumorigenesis. To better understand how the colonic stem cell population responds to environmental factors such as diet and carcinogen, we additionally determined the effects of the chemoprotective FPA diet on miRNAs and mRNAs in colonic stem cells obtained from Lgr5-EGFP-IRES-creERT2 knock-in mice. Following global miRNA profiling, 26 miRNAs (P < 0.05) were differentially expressed in Lgr5high stem cells as compared to Lgr5negative differentiated cells. FPA treatment up-regulated miR-19b, miR-26b and miR-203 expression as compared to CCA specifically in Lgr5high cells. In contrast, in Lgr5negative cells, only miR-19b and its indirect target PTK2B were modulated by the FPA diet. These data indicate for the first time that select dietary cues can impact stem cell regulatory networks, in part, by modulating the steady-state levels of miRNAs. To our knowledge, this is the first study to utilize Lgr5+ reporter mice to determine the impact of diet and carcinogen on miRNA expression in colonic stem cells and their progeny.