Evaluation of the coverage and depth of transcriptome by RNA-Seq in chickens

Wang, Ying; Ghaffari, Noushin; Johnson, Charles D.; Braga-Neto, Ulisses; Wang, Huihui; Chen, Rui-rui; Zhou, Huaijun

doi:10.1186/1471-2105-12-s10-s5

Cited by 97 publications

(91 citation statements)

References 17 publications


“…Our results are in general agreement with Wang et al (2011) that 30-40 million reads is sufficient to be technically precise in measuring gene expression for most genes, which is not surprising given the methods used. We and Wang et al (2011) used raw count data, rather than the "fragments per kilobase of exon model per million mapped reads" (FPKM) normalized data as implemented in the Cufflinks software others, 2010, 2012).…”

Section: Discussion (supporting)

confidence: 80%

“…We and Wang et al (2011) used raw count data, rather than the "fragments per kilobase of exon model per million mapped reads" (FPKM) normalized data as implemented in the Cufflinks software others, 2010, 2012). The mathematical derivations necessary for sample size computation depend on using a valid model for the count data, a task that is tractable for the raw count data but would be much more difficult after the per-isoform scaling of FPKM.…”

Section: Discussion (mentioning)

confidence: 99%
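
To make the raw-count versus FPKM distinction in the statement above concrete, here is a minimal sketch of the standard FPKM definition (fragments per kilobase of exon model per million mapped fragments). The function and parameter names are hypothetical and chosen for illustration; this is not the Cufflinks implementation, which estimates isoform-level abundances before scaling.

```python
def fpkm(fragments: float, length_bp: float, total_mapped_fragments: float) -> float:
    """Scale a raw fragment count by feature length and library size.

    fragments              -- raw fragments assigned to the gene or isoform
    length_bp              -- exon-model length in base pairs
    total_mapped_fragments -- total mapped fragments in the library
    (all three names are illustrative assumptions)
    """
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (per kilobase) * 1e6 (per million mapped fragments)
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # A 2 kb gene with 500 fragments in a library of 25 million mapped fragments.
    print(fpkm(500, 2000, 25_000_000))
```

Because the scaled value depends on both the feature length and the library size, it no longer behaves like a simple count, which is the difficulty the citing authors point to.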

“…They concluded that only 6% of genes are within 10% of their true expression level when 100 million reads are sequenced, but the percentage of genes jumped to 72% when five-fold more reads are sequenced. In contrast, Wang et al (2011) suggested that only 30 million reads are necessary to quantify gene expression in chicken lungs, and that 10 million reads could reliably estimate the level of expression of 80% of genes. This broad range of estimates, and the consequences for planning experiments, provides an attractive research opportunity to clarify the influence of variability.…”

mentioning

confidence: 99%


Calculating Sample Size Estimates for RNA Sequencing Data

Hart, Therneau, Zhang, et al. 2013. Journal of Computational Biology


Background: Given the high technical reproducibility and orders of magnitude greater resolution than microarrays, next-generation sequencing of mRNA (RNA-Seq) is quickly becoming the de facto standard for measuring levels of gene expression in biological experiments. Two important questions must be taken into consideration when designing a particular experiment, namely, 1) how deep does one need to sequence? and 2) how many biological replicates are necessary to observe a significant change in expression? Results: Based on the gene expression distributions from 127 RNA-Seq experiments, we find evidence that 91% ± 4% of all annotated genes are sequenced at a frequency of 0.1 times per million bases mapped, regardless of sample source. Based on this observation, and combining this information with other parameters such as biological variation and technical variation that we empirically estimate from our large datasets, we developed a model to estimate the statistical power needed to identify differentially expressed genes from RNA-Seq experiments. Conclusions: Our results provide a needed reference for ensuring RNA-Seq gene expression studies are conducted with the optimal sample size, power, and sequencing depth. We also make available both R code and an Excel worksheet for investigators to perform these calculations for their own experiments.
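
The abstract above describes a model linking sequencing depth, variation, and statistical power, but does not state it. As a rough illustration only, the sketch below uses a common normal-approximation form for samples per group, n = 2 (z_{1-α/2} + z_β)² (1/μ + σ²) / (ln Δ)², where μ is the average depth of coverage for a gene, σ is the biological coefficient of variation, and Δ is the fold change to detect. This form and all function and parameter names are assumptions made for the example; it is not the authors' R code or Excel worksheet.

```python
import math
from statistics import NormalDist


def samples_per_group(depth: float, cv: float, effect: float,
                      alpha: float = 0.05, power: float = 0.9) -> int:
    """Approximate biological replicates needed per group (illustrative sketch).

    depth  -- assumed average read coverage for the gene of interest
    cv     -- assumed biological coefficient of variation
    effect -- fold change to detect (for example 2.0)
    """
    if depth <= 0 or effect <= 0 or effect == 1:
        raise ValueError("depth must be positive and effect must differ from 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    # 1/depth is the counting-noise term; cv**2 is the biological variation term.
    variance = 1 / depth + cv ** 2
    n = 2 * (z_alpha + z_beta) ** 2 * variance / math.log(effect) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    for depth in (5, 20, 1000):
        print(depth, samples_per_group(depth, cv=0.4, effect=2.0))
```

Under this approximation, increasing depth shrinks only the 1/μ term, so the required number of replicates levels off at a floor set by biological variation. That is consistent with the trade-off between sequencing depth and replication discussed in the citing papers.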


“…(ii) Sequencing efforts in the range of 5-20 M mapped reads per sample provide sufficient depth to accurately quantify gene expression across a broad range of expression levels in diverse eukaryotic transcriptomes (Tarazona et al 2011; Wang et al 2011; Hart et al 2013; Vijay et al 2013; Ching et al 2014; Liu et al 2014; Williams et al 2014). For example, Hart et al (2013) examined expression distributions for 127 RNAseq experiments (six replicated studies; human and zebrafish), finding that 10 M mapped reads were sufficient to cover approximately 90% of transcripts with >10 reads in a range of biosamples (cell lines, tissue/organ and population comparisons).…”

Section: More Sequence Is Not Necessarily Better (mentioning)

confidence: 99%

“…2). Here, given the relatively low cost of sequencing, pilot work can be expanded to obtain ~100 M paired-end reads (>100 bp), recommended in the current literature as sufficient to capture the majority of RNAs expressed in eukaryotic samples (Wang et al 2011; Francis et al 2013; Vijay et al 2013; Wolf 2013). …”

Section: Designing Better RNA-seq Experiments (mentioning)

confidence: 99%

The power and promise of RNA‐seq in ecology and evolution

2016


Reference is regularly made to the power of new genomic sequencing approaches. Using powerful technology, however, is not the same as having the necessary power to address a research question with statistical robustness. In the rush to adopt new and improved genomic research methods, limitations of technology and experimental design may be initially neglected. Here, we review these issues with regard to RNA sequencing (RNA-seq). RNA-seq adds large-scale transcriptomics to the toolkit of ecological and evolutionary biologists, enabling differential gene expression (DE) studies in nonmodel species without the need for prior genomic resources. High biological variance is typical of field-based gene expression studies and means that larger sample sizes are often needed to achieve the same degree of statistical power as clinical studies based on data from cell lines or inbred animal models. Sequencing costs have plummeted, yet RNA-seq studies still underutilize biological replication. Finite research budgets force a trade-off between sequencing effort and replication in RNAseq experimental design. However, clear guidelines for negotiating this trade-off, while taking into account study-specific factors affecting power, are currently lacking. Study designs that prioritize sequencing depth over replication fail to capitalize on the power of RNA-seq technology for DE inference. Significant recent research effort has gone into developing statistical frameworks and software tools for power analysis and sample size calculation in the context of RNA-seq DE analysis. We synthesize progress in this area and derive an accessible rule-of-thumb guide for designing powerful RNA-seq experiments relevant in eco-evolutionary and clinical settings alike.


Experimental approaches for gene regulatory network construction: The chick as a model system

Streit, Tambalo, Chen, et al. 2012. Genesis


Setting up the body plan during embryonic development requires the coordinated action of many signals and transcriptional regulators in a precise temporal sequence and spatial pattern. The last decades have seen an explosion of information describing the molecular control of many developmental processes. The next challenge is to integrate this information into logic ‘wiring diagrams’ that visualise gene actions and outputs, have predictive power and point to key control nodes. Here we provide an experimental workflow on how to construct gene regulatory networks using the chick as model system. Keywords: transcription factors, transcriptome analysis, conserved regulatory elements

