Recent advances in high-throughput cDNA sequencing (RNA-seq) can reveal new genes and splice variants and quantify expression genome-wide in a single assay. The volume and complexity of data from RNA-seq experiments necessitate scalable, fast and mathematically principled analysis software. TopHat and Cufflinks are free, open-source software tools for gene discovery and comprehensive expression analysis of high-throughput mRNA sequencing (RNA-seq) data. Together, they allow biologists to identify new genes and new splice variants of known ones, as well as compare gene and transcript expression under two or more conditions. This protocol describes in detail how to use TopHat and Cufflinks to perform such analyses. It also covers several accessory tools and utilities that aid in managing data, including CummeRbund, a tool for visualizing RNA-seq analysis results. Although the procedure assumes basic informatics skills, these tools assume little to no background with RNA-seq analysis and are meant for novices and experts alike. The protocol begins with raw sequencing reads and produces a transcriptome assembly, lists of differentially expressed and regulated genes and transcripts, and publication-quality visualizations of analysis results. The protocol's execution time depends on the volume of transcriptome sequencing data and available computing resources but takes less than 1 d of computer time for typical experiments and ~1 h of hands-on time.
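The workflow summarized above (align reads with TopHat, assemble with Cufflinks, merge assemblies, then test for differential expression) can be sketched as a small command builder. This is a minimal illustration, not the protocol itself: the sample names, output directory names, and the `build_commands` helper are hypothetical, and the function only constructs argument lists without running anything. The `assemblies.txt` manifest is assumed to be written by the caller from the returned list of GTF paths.

```python
def build_commands(conditions, genome_index, genome_fasta, annotation_gtf, threads=8):
    """Return (commands, assembly_paths) for a TopHat/Cufflinks-style run.

    conditions: dict mapping a condition label to a list of
                (replicate_name, [fastq, ...]) tuples. All names are illustrative.
    """
    commands = []
    assembly_paths = []      # one transcripts.gtf per replicate, for the merge manifest
    bams_per_condition = []  # comma-joined BAM lists, one entry per condition

    for label, replicates in conditions.items():
        bams = []
        for rep_name, fastqs in replicates:
            align_dir = f"{rep_name}_thout"      # hypothetical directory naming
            assembly_dir = f"{rep_name}_clout"
            bam = f"{align_dir}/accepted_hits.bam"

            # Spliced alignment of the reads against the genome index.
            commands.append(["tophat", "-p", str(threads), "-G", annotation_gtf,
                             "-o", align_dir, genome_index, *fastqs])
            # Per-replicate transcript assembly from the alignments.
            commands.append(["cufflinks", "-p", str(threads),
                             "-o", assembly_dir, bam])

            assembly_paths.append(f"{assembly_dir}/transcripts.gtf")
            bams.append(bam)
        bams_per_condition.append(",".join(bams))

    # Merge all assemblies; assemblies.txt should list assembly_paths, one per line.
    commands.append(["cuffmerge", "-g", annotation_gtf, "-s", genome_fasta,
                     "-p", str(threads), "-o", "merged_asm", "assemblies.txt"])
    # Differential expression across conditions, with bias correction (-b)
    # and multi-read correction (-u).
    commands.append(["cuffdiff", "-o", "diff_out", "-b", genome_fasta,
                     "-p", str(threads), "-L", ",".join(conditions), "-u",
                     "merged_asm/merged.gtf", *bams_per_condition])
    return commands, assembly_paths
```

Each returned list could be passed to `subprocess.run` once the tools are installed; keeping construction separate from execution makes the plan easy to inspect before committing computer time.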
The biochemistry of RNA-Seq library preparation results in cDNA fragments that are not uniformly distributed within the transcripts they represent. This non-uniformity must be accounted for when estimating expression levels, and we show how to perform the needed corrections using a likelihood-based approach. We find improvements in expression estimates, as measured by correlation with independently performed qRT-PCR, and show that correction of bias leads to improved replicability of results across libraries and sequencing technologies.
SUMMARY
Trimethylamine N-oxide (TMAO), a gut microbiota-dependent metabolite, both enhances atherosclerosis in animal models and is associated with cardiovascular risks in clinical studies. Here we investigate the impact of targeted inhibition of the first step in TMAO generation, commensal microbial trimethylamine (TMA) production, on diet-induced atherosclerosis. A structural analogue of choline, 3,3-dimethyl-1-butanol (DMB), is shown to non-lethally inhibit TMA formation from cultured microbes, to inhibit distinct microbial TMA lyases, and to both inhibit TMA production from physiologic polymicrobial cultures (e.g., intestinal contents, human feces) and reduce TMAO levels in mice fed a high-choline or high-carnitine diet. DMB inhibited choline diet-enhanced endogenous macrophage foam cell formation and atherosclerotic lesion development in apolipoprotein e−/− mice without alterations in circulating cholesterol levels. The present studies suggest that targeting gut microbial production of TMA specifically, and non-lethal microbial inhibitors in general, may serve as a potential therapeutic approach for the treatment of cardiometabolic diseases.
We present eXpress, a software package for highly efficient probabilistic assignment of ambiguously mapping sequenced fragments. eXpress uses a streaming algorithm with linear run time and constant memory use. It can determine abundances of sequenced molecules in real time, and can be applied to ChIP-seq, metagenomics and other large-scale sequencing data. We demonstrate its use on RNA-seq data, showing greater efficiency than other quantification methods.
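The idea of assigning ambiguous fragments in a single streaming pass can be illustrated with a toy online expectation-maximization loop. This is a simplified sketch under stated assumptions, not the eXpress algorithm: it ignores target lengths, error models, and any step-size schedule, and the `stream_assign` function and its uniform pseudo-count prior are hypothetical. Each fragment is seen once, its unit of mass is split among compatible targets in proportion to their current estimated abundance, and only one number per target is kept, so memory does not grow with the number of fragments.

```python
def stream_assign(fragments, targets, prior=1.0):
    """Toy single-pass probabilistic assignment of ambiguously mapping fragments.

    fragments: iterable (may be a generator) of lists of compatible target names.
    targets:   iterable of all target names.
    prior:     pseudo-count given to every target before any data is seen
               (an assumption of this sketch).
    Returns a dict of relative abundances that sums to 1.
    """
    # One accumulator per target: memory is independent of the fragment count.
    mass = {t: float(prior) for t in targets}

    for compatible in fragments:
        # Posterior over compatible targets, proportional to current mass.
        weights = [mass[t] for t in compatible]
        total = sum(weights)
        # Split this fragment's unit of mass according to that posterior.
        for t, w in zip(compatible, weights):
            mass[t] += w / total

    grand_total = sum(mass.values())
    return {t: m / grand_total for t, m in mass.items()}


if __name__ == "__main__":
    # Three fragments map uniquely to A, one maps to both A and B.
    reads = (r for r in [["A"], ["A"], ["A"], ["A", "B"]])
    print(stream_assign(reads, ["A", "B"]))
```

Because every fragment is processed once and then discarded, the run time is linear in the number of fragments, which is the property the abstract highlights.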