2019
DOI: 10.1101/gr.234443.118

Resolving the full spectrum of human genome variation using Linked-Reads

Abstract: Large-scale population analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short-read whole-genome sequencing. However, these short-read approaches fail to give a complete picture of a genome. They struggle to identify structural events, cannot access repetitive regions, and fail to resolve the human genome into haplotypes. Here, we describe an approach that retains long range info…

Cited by 206 publications (127 citation statements)
References 40 publications

“…First, the reads were aligned to the bovine reference genome (UMD 3.1) using the Lariat aligner; subsequently, SNPs and insertion-deletion polymorphisms (indels) were called using the GATK mode (version 3.8) [26] within Long Ranger pipeline. The identified SNPs (Single nucleotide polymorphisms) were phased using two methods: first, using the phasing method implemented in Long Ranger, which builds on the Markov chain Monte Carlo (MCMC) algorithm-based phasing method proposed by Bansal et al [27] by extending the probabilistic model to be robust to mixed fragments containing alleles from both haplotypes [28]. In the second method, the variants (SNPs) were phased using Hapcut2 [18] using default options.…”
Section: Read Assembly and Haplotype Phasing
Citation type: mentioning
confidence: 99%
“…Similar technologies have recently been further optimized and commercialized by 10× Genomics Inc. (Weisenfeld et al, 2017). These technologies have already provided haplotype aware SR-WGS (Zheng et al, 2016) and improved SV calling from SR-WGS data (Nazaryan-Petersen et al, 2018; Marks et al, 2019). While synthetic long-reads leverage advantages of LRS, some short-read issues may persist, e.g., PCR-bias and intra-read complexity may not always be fully resolved.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…to process stLFR reads for human germline variant calling and phasing 37 . Using a high accuracy pipeline to call variants of CCS long reads, which was also used in the previous study of human HG002/NA24385 with high precision and recall values of variant-calling 38 .…”
Section: Discussion
Citation type: mentioning
confidence: 99%