HTSeq—a Python framework to work with high-throughput sequencing data

Anders, Simon; Pyl, Paul Theodor; Huber, Wolfgang

doi:10.1093/bioinformatics/btu638

Cited by 17,851 publications

(12,527 citation statements)

References 12 publications

Supporting

Mentioning: 12,009

Contrasting

Unclassified

“…Mapping of poly(A)-selected reads was performed with TopHat v. 2.0.8b [77] using default settings. Read counts were generated with HTSeq-count v. 0.6.1p1 [78] and differential expression analysis was performed using DESeq2 package v. 1.8.1 [79]. Coverage of RNA reads were analyzed and visualized with Integrative Genomics viewer [80,81].…”

Section: Methods (mentioning)

confidence: 99%

Global characterization of the Dicer-like protein DrnB roles in miRNA biogenesis in the social amoeba Dictyostelium discoideum

et al. 2018

Micro (mi)RNAs regulate gene expression in many eukaryotic organisms where they control diverse biological processes. Their biogenesis, from primary transcripts to mature miRNAs, have been extensively characterized in animals and plants, showing distinct differences between these phylogenetically distant groups of organisms. However, comparably little is known about miRNA biogenesis in organisms whose evolutionary position is placed in between plants and animals and/or in unicellular organisms. Here, we investigate miRNA maturation in the unicellular amoeba Dictyostelium discoideum, belonging to Amoebozoa, which branched out after plants but before animals. High-throughput sequencing of small RNAs and poly(A)-selected RNAs demonstrated that the Dicer-like protein DrnB is required, and essentially specific, for global miRNA maturation in D. discoideum. Our RNA-seq data also showed that longer miRNA transcripts, generally preceded by a T-rich putative promoter motif, accumulate in a drnB knock-out strain. For two model miRNAs we defined the transcriptional start sites (TSSs) of primary (pri)-miRNAs and showed that they carry the RNA polymerase II specific m7G-cap. The generation of the 3ʹ-ends of these pri-miRNAs differs, with pri-mir-1177 reading into the downstream gene, and pri-mir-1176 displaying a distinct end. This 3´-end is processed to shorter intermediates, stabilized in DrnB-depleted cells, of which some carry a short oligo(A)-tail. Furthermore, we identified 10 new miRNAs, all DrnB dependent and developmentally regulated. Thus, the miRNA machinery in D. discoideum shares features with both plants and animals, which is in agreement with its evolutionary position and perhaps also an adaptation to its complex lifestyle: unicellular growth and multicellular development.

“…DGE (i.e., testing for changes in the overall transcriptional output of a gene) is typically performed by applying a count-based inference method from statistical packages such as edgeR [12] or DESeq2 [11] to gene counts obtained by read counting software such as featureCounts [1], HTSeq-count [2] or functions from the GenomicAlignments [22] R package. A lot has been written about how simple counting approaches are prone to give erroneous results for genes with changes in relative isoform usage, due to the direct dependence of the observed read count on the transcript length [23].…”

Section: Incorporating Transcript-level Estimates Leads To More Accur… (mentioning)

confidence: 99%

“…Currently, one of the most common approaches is to define a set of non-overlapping targets (typically, genes) and use the number of reads overlapping a target as a measure of its abundance, or expression level. Several software packages have been developed for performing such “simple” counting (e.g., featureCounts [1] and HTSeq-count [2]). More recently, the field has seen a surge in methods aimed at quantifying the abundances of individual transcripts (e.g., Cufflinks [3], RSEM [4], BitSeq [5], kallisto [6] and Salmon [7]).…”

Section: Introduction (mentioning)

confidence: 99%

Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences

2015

High-throughput sequencing of cDNA (RNA-seq) is used extensively to characterize the transcriptome of cells. Many transcriptomic studies aim at comparing either abundance levels or the transcriptome composition between given conditions, and as a first step, the sequencing reads must be used as the basis for abundance quantification of transcriptomic features of interest, such as genes or transcripts. Several different quantification approaches have been proposed, ranging from simple counting of reads that overlap given genomic regions to more complex estimation of underlying transcript abundances. In this paper, we show that gene-level abundance estimates and statistical inference offer advantages over transcript-level analyses, in terms of performance and interpretability. We also illustrate that while the presence of differential isoform usage can lead to inflated false discovery rates in differential expression analyses on simple count matrices and transcript-level abundance estimates improve the performance in simulated data, the difference is relatively minor in several real data sets. Finally, we provide an R package (tximport) to help users integrate transcript-level abundance estimates from common quantification pipelines into count-based statistical inference engines.
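The "simple" counting described in the citation statements above (a set of non-overlapping gene targets, with each read assigned to the target it overlaps) can be illustrated with a minimal pure-Python sketch. This is not the HTSeq-count or featureCounts implementation: the `Interval` type, the `count_reads` function, the gene names, and the coordinates are all hypothetical, and the rule for reads that hit no gene or more than one gene is just one possible convention.

```python
from collections import Counter
from typing import NamedTuple


class Interval(NamedTuple):
    """A genomic interval (hypothetical helper type for this sketch)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def count_reads(genes: dict, reads: list) -> Counter:
    """Count reads per gene; reads hitting zero or several genes go to special bins."""
    counts = Counter({name: 0 for name in genes})
    for read in reads:
        hits = [name for name, gene in genes.items() if overlaps(read, gene)]
        if len(hits) == 1:
            counts[hits[0]] += 1          # unambiguous assignment
        elif not hits:
            counts["__no_feature"] += 1   # read overlaps no target
        else:
            counts["__ambiguous"] += 1    # read spans more than one target
    return counts


# Toy, made-up annotation and alignments (illustrative only).
genes = {
    "geneA": Interval("chr1", 100, 200),
    "geneB": Interval("chr1", 300, 400),
}
reads = [
    Interval("chr1", 120, 170),  # inside geneA
    Interval("chr1", 190, 240),  # partially overlaps geneA
    Interval("chr1", 250, 290),  # intergenic
    Interval("chr1", 180, 320),  # spans geneA and geneB
    Interval("chr2", 100, 150),  # different chromosome
]
counts = count_reads(genes, reads)
```

The linear scan over all genes for every read keeps the sketch short; a real tool would use an indexed interval structure and would also have to handle strandedness, spliced alignments, and multi-mapping reads.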

“…The raw reads were filtered by Seqtk and then mapped to the M. tb H37Rv strain reference sequence (GenBank NC_018143.1) using Bowtie2 (version: 2–2.0.5) [43]. Counting of reads per gene was performed using HTSeq followed by TMM (trimmed mean of M-values) normalization [44,45]. Differentially expressed genes were defined as those with a false discovery rate <0.05 and fold-change >2 using the edgeR software [46].…”

Section: Methods (mentioning)

confidence: 99%

Transcription factors Rv0081 and Rv3334 connect the early and the enduring hypoxic response of Mycobacterium tuberculosis

et al. 2018

The ability of Mycobacterium tuberculosis (M. tb) to survive and persist in the host for decades in an asymptomatic state is an important aspect of tuberculosis pathogenesis. Although adaptation to hypoxia is thought to play a prominent role underlying M. tb persistence, how the bacteria achieve this goal is largely unknown. Rv0081, a member of the DosR regulon, is induced at the early stage of hypoxia while Rv3334 is one of the enduring hypoxic response genes. In this study, we uncovered genetic interactions between these two transcription factors. RNA-seq analysis of ΔRv0081 and ΔRv3334 revealed that the gene expression profiles of these two mutants were highly similar. We also found that under hypoxia, Rv0081 positively regulated the expression of Rv3334 while Rv3334 repressed transcription of Rv0081. In addition, we demonstrated that Rv0081 formed dimer and bound to the promoter region of Rv3334. Taken together, these data suggest that Rv0081 and Rv3334 work in the same regulatory pathway and that Rv3334 functions immediately downstream of Rv0081. We also found that Rv3334 is a bona fide regulator of the enduring hypoxic response genes. Our study has uncovered a regulatory pathway that connects the early and the enduring hypoxic response, revealing a transcriptional cascade that coordinates the temporal response of M. tb to hypoxia.
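The selection rule quoted in the citation statement for this paper (false discovery rate <0.05 and fold-change >2) can be sketched as a simple filter. This is a hedged illustration, not the authors' code or an edgeR call: the gene identifiers and numbers are invented toy values, the function name is hypothetical, and treating "fold-change >2" as a change of more than 2-fold in either direction is an assumption the quoted text does not settle.

```python
import math

# Toy result rows: (gene id, fold-change, FDR). All values are made up.
results = [
    ("gene1", 4.0, 0.001),   # strongly up, significant
    ("gene2", 1.5, 0.010),   # significant but below the fold-change cutoff
    ("gene3", 0.25, 0.030),  # 4-fold down, significant
    ("gene4", 3.0, 0.200),   # large change but not significant
]


def is_differentially_expressed(fold_change, fdr, fc_cutoff=2.0, fdr_cutoff=0.05):
    """Apply both cutoffs; the fold-change test is symmetric on the log2 scale
    (assumption: up- and down-regulation are treated alike)."""
    return fdr < fdr_cutoff and abs(math.log2(fold_change)) > math.log2(fc_cutoff)


de_genes = [gene for gene, fc, fdr in results if is_differentially_expressed(fc, fdr)]
```

Working on the log2 scale makes a 4-fold decrease (fold-change 0.25) count the same as a 4-fold increase, which a plain `fold_change > 2` comparison would miss.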
