2017
DOI: 10.12688/f1000research.12223.1

recount workflow: Accessing over 70,000 human RNA-seq samples with Bioconductor

Abstract: The recount2 resource is composed of over 70,000 uniformly processed human RNA-seq samples spanning TCGA and SRA, including GTEx. The processed data can be accessed via the recount2 website and the recount Bioconductor package. This workflow explains in detail how to use the recount package and how to integrate it with other Bioconductor packages for several analyses that can be carried out with the recount2 resource. In particular, we describe how the coverage count matrices were computed in recount2 as well as…

citations
Cited by 51 publications
(45 citation statements)
references
References 28 publications
“…Interactive display). Once a study of interest has been identified, the user can download the gene expression data from recount2 (Collado-Torres et al, 2017c) and append the recount-brain metadata as shown in Figure 2; this process is equivalent to appending custom metadata from Figure 2 of the recount workflow (Collado-Torres et al, 2017a). Once the expression data from recount2 and the sample metadata from recount-brain have been combined, the user can proceed to perform analyses such as identification of differentially expressed genes and enriched gene ontologies, examples of them are illustrated in Figure 2 with data from SRA study SRP027383 (Bao et al, 2014).…”
Section: Results (mentioning)
confidence: 99%
“…Because the SRA is made up of individual submissions, data are not provided in a consistent format and annotations (such as methodology and technical sequencing details) are often unclear or missing (Bernstein et al, 2017). We previously developed recount2 (Collado-Torres et al, 2017c, 2017a), a public resource with over 70,000 uniformly processed human RNA-seq samples, enabling comparisons across studies for human expression data in the public repository. However, inconsistent phenotype annotation still remains a barrier to taking advantage of public uniformly processed reads.…”
Section: Introduction (mentioning)
confidence: 99%
“…Previously called exon-exon junction data including phenotype table, bed and coverage files for both TCGA and GTEx v6 were downloaded from the recount2 service at https://jhubiostatistics.shinyapps.io/recount/ (Collado-Torres et al, 2017). The metaSRA (Bernstein et al, 2017) web query form at http://metasra.biostat.wisc.edu/ was queried for experiment accession numbers for 1) non-cancer cell and tissue type samples (see Table S1 for cancer-matched samples and Table S3 for non-cancer samples, and "Selection of SRA tissue and cell types" for a description of how these samples were chosen) and 2) TCGA-matched cancer types (see Table S1).…”
Section: Methods Details, Data Download (mentioning)
confidence: 99%
“…One of the best examples of this approach is DEE2 - Digital Expression Explorer 2 (DEE2) [3] - a repository of uniformly processed RNA-seq data obtained from NCBI Short Read Archive. There are also other examples: ARCHS4, the massive collection of uniformly processed murine and human public transcriptomic datasets [4], recount2 [5], etc. However, the most important task in transcriptomic data harmonization is the correction of batch effects and in general it remains unresolved.…”
Section: Introduction (mentioning)
confidence: 99%