2019
DOI: 10.3390/jpm9020018

Aligning the Aligners: Comparison of RNA Sequencing Data Alignment and Gene Expression Quantification Tools for Clinical Breast Cancer Research

Abstract: The rapid expansion of transcriptomics and affordability of next-generation sequencing (NGS) technologies generate rocketing amounts of gene expression data across biology and medicine, including cancer research. Concomitantly, many bioinformatics tools were developed to streamline gene expression and quantification. We tested the concordance of NGS RNA sequencing (RNA-seq) analysis outcomes between two predominant programs for read alignment, HISAT2, and STAR, and two most popular programs for quantifying gen…

Cited by 22 publications
(23 citation statements)
References 37 publications
“…The most common software platforms to align RNA sequence to a reference genome are TopHat [68], HiSAT [69], and STAR [70]. These platforms differ with respect to speed, memory usage, and their algorithms for handling pseudogenes [48], base and splice junction alignment precision, with HiSAT and STAR optimized to process large datasets (>10^8 reads), whereas TopHat is designed for smaller datasets (<2 × 10^7 reads).…”
Section: Results Of Transcriptome Analysis: Unbiased Data Mining T (mentioning)
confidence: 99%
“…In eukaryotic organisms, if only protein coding genes are of interest, poly(A) selection yields greater accuracy of transcript quantification [47]. These issues are particularly critical for clinical samples from patients, which are routinely processed as formalin-fixed, paraffin-embedded (FFPE) samples, which adversely impact the quality of RNA and subsequent alignment to pseudogenes [48]. Fortuitously, side-by-side comparison of FFPE and flash-frozen samples shows a great degree of concordance (e.g., r^2 in the range of 0.90–0.97 in recent studies [49,50]), proving RNAseq is a viable tool for gene quantification in clinical settings.…”
Section: Materials and Methods Used In Transcriptomic Studies (mentioning)
confidence: 99%
“…The first step of RNA-Seq alignment is curating an organism reference to which the alignment software will map sequence reads. XPRESSpipe uses STAR [53] for mapping reads as it has been shown consistently to be the best performing RNA-Seq read aligner for the majority of cases [54,55]. The appropriate reference files are automatically curated by providing the appropriate GTF file saved as transcripts.gtf and the directory path to the genomic FASTA file(s).…”
Section: Automated Reference Curation (mentioning)
confidence: 99%
“…Reads are aligned to the reference genome with STAR, which, despite being more memory-intensive, is one of the fastest and most accurate sequence alignment options currently available [53][54][55]. XPRESSpipe is capable of performing a single-pass, splice-aware, GTF-guided alignment or a two-pass alignment of reads wherein novel splice junctions are determined and built into the genome index, followed by alignment of reads using the updated index.…”
Section: Alignment (mentioning)
confidence: 99%