2022
DOI: 10.1101/2022.07.22.501076
Preprint

Benchmarking long-read RNA-sequencing analysis tools using in silico mixtures

Abstract: The current lack of benchmark datasets with inbuilt ground-truth makes it challenging to compare the performance of existing long-read isoform detection and differential expression analysis workflows. Here, we present a benchmark experiment using two human lung adenocarcinoma cell lines that were each profiled in triplicate together with synthetic, spliced, spike-in RNAs ("sequins"). Samples were deeply sequenced on both Illumina short-read and Oxford Nanopore Technologies long-read platforms. Alongside the gr…

Cited by 21 publications (35 citation statements)
References 74 publications
“…Using RNA-seq data from the lung adenocarcinoma cell lines, we measured the quantification uncertainty by estimating the mapping ambiguity overdispersion from annotated transcripts. For comparative purposes, we also measured the quantification uncertainty resulting from the quantification of ONT long read libraries of the same human adenocarcinoma cell line populations, which has been previously shown to be small [27]. We observed a strong mapping ambiguity overdispersion trend that increased with the number of annotated transcripts per gene (Figure 1).…”
Section: Mapping Ambiguity Overdispersion Increases With Transcript O... (mentioning)
confidence: 92%
“…Datasets. For all performance testing and analysis in this paper, we used a PCR-cDNA dataset of lung adenocarcinoma cell lines (7) which included synthetic RNA spike-in (sequins (8)), deposited under accession GEO GSE172421. Reads generated with SQK-PCS109 and PBK004 kits on PromethION R9.4.1 flow cells were basecalled using guppy v5.1.13+b292f4d13 with parameters -c dna_r9.4.1_450bps_sup_prom.cfg --trim_strategy none --min_qscore 10 --barcode_kits SQK-PCB109 --do_read_splitting.…”
Section: Methods (mentioning)
confidence: 99%
“…As a result, the use of current reference transcript sequences does not provide a complete picture of how a variant affects molecular functioning. Long-read sequencing allows for the accurate elucidation of isoforms (7) and long-read RNA sequencing datasets are proving that the human transcriptome has much more diversity than previously thought (8)(9)(10). In addition, both short and long-read sequencing have shown that gene expression is highly variable in a context dependent manner, e.g.…”
Section: Background/objectives (mentioning)
confidence: 99%