2022
DOI: 10.1101/2022.12.09.519749
Preprint

TargetCall: Eliminating the Wasted Computation in Basecalling via Pre-Basecalling Filtering

Abstract: Basecalling is an essential step in nanopore sequencing analysis where the raw signals of nanopore sequencers are converted into nucleotide sequences, i.e., reads. State-of-the-art basecallers employ complex deep learning models to achieve high basecalling accuracy. This makes basecalling computationally inefficient and memory-hungry, bottlenecking the entire genome analysis pipeline. However, for many applications, the majority of reads do not match the reference genome of interest (i.e., target reference) and…

Cited by 4 publications (10 citation statements)
References 73 publications
“…Although we do not evaluate in this work, we expect that RawHash can be used as a low-cost filter to eliminate the reads that are unlikely to be useful in downstream analysis, which can reduce the overall workload of basecallers and further downstream analysis. We believe that RawHash can be applied for such a filtering purpose since a previous work [39] proposes a lightweight basecalling solution to use as a filter before using the costly basecallers. Limitations.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…All the above works accelerate the basecalling step without eliminating the wasted computation in basecalling. TargetCall [39] proposes a pre-basecalling filter that eliminates the wasted computation in basecalling by leveraging the observation that the majority of reads are discarded after basecalling. However, RawHash is different from these works as its goal is to perform real-time analysis of raw signals without performing the computationally-intensive basecalling step.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…Previous studies have used lightweight basecallers for faster adaptive sampling [13]. Our picoamp binning procedure could be considered a very simple "basecalling" step.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…However, these networks comprise millions of parameters and are computationally infeasible to run without hardware acceleration [12]. Even using acceleration, basecalling is the rate-limiting step for read classification algorithms [13]. One study found that over 95% of the compute time in a variant calling pipeline for SARS-CoV2 was spent in basecalling [14].…”
Section: Introduction (citation type: mentioning)
confidence: 99%