Zhenxian Zheng scite author profile

Deep learning-based variant callers are becoming the standard and have achieved superior SNP calling performance using long reads. In this paper, we present Clair3, which makes the best of two major method categories: pileup calling handles most variant candidates with speed, and full-alignment tackles complicated candidates to maximize precision and recall. Clair3 ran faster than any of the other state-of-the-art variant callers and performed the best, especially at lower coverage.

Main Text

The first preprint of DeepVariant [1] was released in late 2016, marking the beginning of the use of deep learning-based methods (DL methods) instead of traditional statistical methods for variant calling. Over the years, several DL methods have been developed. We are now witnessing a complete take-over, led by DeepVariant for short-read variant calling. Long-read variant calling using Oxford Nanopore (ONT) data, on the other hand, has been dominated by DL methods since the beginning, primarily owing to the difficulty caused by its higher base error rate in general. Although the DL methods for short reads and long reads have a lot in common, the problem of long-read variant calling is considered more difficult. This led to two major designs, using either pileup or full-alignment as the input of the decision-making neural network, which are significantly different in both performance and speed. Long-read variant callers, including Clairvoyante [2], Clair [3], and NanoCaller [4], are pileup-based, in which the read alignments are summarized into features and counts before being input into a variant calling network. PEPPER-Margin-DeepVariant [5] (PEPPER) is full alignment-based. The input to the DeepVariant variant calling network is kept with spatial information in the read alignments and is tens of times larger than the pileup inputs in terms of size. Medaka [6] is consensus-based; it uses pileup input to generate a diploid consensus in the first iteration and two haploid consensuses in the second. The differences between the reference and consensuses are identified and combined into variants. These are all state-of-the-art algorithms; the pileup-based algorithms are usually superior in terms of time efficiency and the full-alignment algorithms provide the best precision and recall. However, while the two designs are not mutually exclusive, there have not been any studies combining pileup calling and full-alignment calling.

To fill the gap, we developed Clair3, the successor to Clair, which makes the best of both designs. It runs as fast as the pileup-based callers and performs as well as the full alignment-based callers. Supplementary Figure 1 shows the workflow for Clair3. The philosophy behind Clair3 is to trust the full-alignment model unless the pileup model can make a quick but reliable decision. First, the pileup calling network goes through all the variant candidates that passed a coverage threshold and an alternative allele frequency threshold. Next, the high-quality pileup calls are used to phase the alignments and as part of the final output. Then, ...
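The first two steps described above (threshold filtering, then keeping only the confident pileup calls) can be illustrated with a minimal sketch. This is not Clair3's code: the `Candidate` type, the `route` function, the three threshold values, and the `pileup_qual` field (standing in for whatever quality the pileup network would assign) are all hypothetical names and numbers chosen for illustration. Low-confidence candidates are simply set aside for the full-alignment model, following the stated philosophy.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; not Clair3's actual defaults.
MIN_COVERAGE = 4
MIN_ALT_AF = 0.08
MIN_PILEUP_QUAL = 15.0


@dataclass
class Candidate:
    pos: int            # reference position of the candidate site
    coverage: int       # number of reads covering the site
    alt_count: int      # reads supporting an alternative allele
    pileup_qual: float  # stand-in for the pileup network's call quality


def passes_thresholds(c: Candidate) -> bool:
    """Keep a site only if coverage and alternative allele frequency are high enough."""
    if c.coverage < MIN_COVERAGE:
        return False
    return c.alt_count / c.coverage >= MIN_ALT_AF


def route(candidates):
    """Split candidates into confident pileup calls and sites deferred to full-alignment."""
    accepted, deferred = [], []
    for c in candidates:
        if not passes_thresholds(c):
            continue  # never reaches either network
        if c.pileup_qual >= MIN_PILEUP_QUAL:
            accepted.append(c)   # quick but reliable pileup decision
        else:
            deferred.append(c)   # left for the full-alignment model
    return accepted, deferred
```

In this sketch, the `accepted` list plays the role of the high-quality pileup calls that go on to phasing and the final output, while `deferred` holds the complicated candidates.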


Symphonizing pileup and full-alignment for deep learning-based long-read variant calling

Zheng

et al. 2021

Preprint


Applications and potentials of nanopore sequencing in the (epi)genome and (epi)transcriptome era

et al. 2021


Realistic action recognition with salient foreground trajectories

Zheng

Lin

2017

Expert Systems with Applications


Clair3-trio: high-performance Nanopore long-read variant calling in family trios with trio-to-trio deep neural networks

Zheng

Ahmed

et al. 2022


Accurate identification of genetic variants from family child–mother–father trio sequencing data is important in genomics. However, state-of-the-art approaches treat variant calling from trios as three independent tasks, which limits their calling accuracy for Nanopore long-read sequencing data. For better trio variant calling, we introduce Clair3-Trio, the first variant caller tailored for family trio data from Nanopore long-reads. Clair3-Trio employs a Trio-to-Trio deep neural network model, which allows it to input the trio sequencing information and output all of the trio’s predicted variants within a single model to improve variant calling. We also present MCVLoss, a novel loss function tailor-made for variant calling in trios, leveraging the explicit encoding of the Mendelian inheritance. Clair3-Trio showed comprehensive improvement in experiments. It predicted far fewer Mendelian inheritance violation variations than current state-of-the-art methods. We also demonstrated that our Trio-to-Trio model is more accurate than competing architectures. Clair3-Trio is accessible as a free, open-source project at https://github.com/HKU-BAL/Clair3-Trio.
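To make the notion of a Mendelian inheritance violation concrete, here is a small sketch of the consistency check for one site in a trio. It is not MCVLoss and not taken from Clair3-Trio; the function name and the genotype encoding (a pair of allele indices, with 0 for the reference allele) are assumptions made for illustration, and the check assumes a diploid autosomal site and ignores de novo mutations.

```python
from itertools import product


def is_mendelian_consistent(child, mother, father):
    """Return True if the child's genotype can be formed by taking
    one allele from the mother and one allele from the father.

    Genotypes are unphased pairs of allele indices, e.g. (0, 1).
    """
    target = tuple(sorted(child))
    return any(
        tuple(sorted((m, f))) == target
        for m, f in product(mother, father)
    )


def count_violations(trio_genotypes):
    """Count sites whose (child, mother, father) genotypes violate Mendelian inheritance."""
    return sum(
        not is_mendelian_consistent(child, mother, father)
        for child, mother, father in trio_genotypes
    )
```

For example, a heterozygous child `(0, 1)` is consistent with parents `(0, 0)` and `(1, 1)`, whereas a homozygous alternative child `(1, 1)` with a homozygous reference mother `(0, 0)` is a violation under these assumptions.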


