2019
DOI: 10.1101/682799
Preprint

Validating Paired-end Read Alignments in Sequence Graphs

Abstract: Graph based non-linear reference structures such as variation graphs and colored de Bruijn graphs enable incorporation of full genomic diversity within a population. However, transitioning from a simple string-based reference to graphs requires addressing many computational challenges, one of which concerns accurately mapping sequencing read sets to graphs. Paired-end Illumina sequencing is a commonly used sequencing platform in genomics, where the paired-end distance constraints allow disambiguation of repeat…

Cited by 11 publications (9 citation statements)
References 44 publications
“…Some research has been done on finding solutions for more specific distance queries in sequence graphs. PairG [8] is a method for determining the validity of independent mappings of reads in a pair by deciding whether there is a path between the mappings whose distance is within a given range. This algorithm uses an index to determine if there is a valid path between two vertices in a single O(1) lookup.…”
Section: Prior Research (mentioning)
confidence: 99%
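
The path-existence check described in the statement above can be illustrated with a small sketch. This is not PairG's implementation: it assumes a toy directed graph in which every vertex carries a single base, so path distance equals the number of edges, and it precomputes with plain Python sets every vertex pair joined by a walk of between `d_min` and `d_max` edges. The names `build_index` and `is_valid_pair` are hypothetical.

```python
def build_index(edges, num_vertices, d_min, d_max):
    """Return the set of (u, v) pairs joined by a walk of d_min..d_max edges.

    Assumption: vertices are integers 0..num_vertices-1 and each edge has
    length 1 (one base per vertex), so distance is simply the edge count.
    """
    # Adjacency lists: successors[u] = vertices directly reachable from u.
    successors = {u: set() for u in range(num_vertices)}
    for u, v in edges:
        successors[u].add(v)

    # frontier[u] = vertices reachable from u by walks of exactly `step` edges.
    frontier = {u: {u} for u in range(num_vertices)}
    valid_pairs = set()
    for step in range(1, d_max + 1):
        # Advance every frontier by one edge (one boolean matrix-product step).
        frontier = {
            u: {w for v in reach for w in successors[v]}
            for u, reach in frontier.items()
        }
        if step >= d_min:
            valid_pairs.update((u, v) for u, reach in frontier.items() for v in reach)
    return valid_pairs


def is_valid_pair(index, u, v):
    """Average-case constant-time lookup: is some walk from u to v within range?"""
    return (u, v) in index


# Toy graph with one bubble: 0 -> {1, 2} -> 3 -> 4
toy_edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
toy_index = build_index(toy_edges, num_vertices=5, d_min=2, d_max=3)
```

With this toy index, `is_valid_pair(toy_index, 0, 4)` holds because vertex 4 is three edges from vertex 0, whereas `is_valid_pair(toy_index, 0, 1)` does not because the only walk is one edge long. The cost is moved to index construction, which is the trade-off the statement describes.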
“…Pairwise alignment dominates our runtime, while sparse matrix construction, which include the creation of both A and A^T, and multiplication take only a tiny percentage of our computation, proving the efficiency of our approach for overlap detection. Interestingly, sparse matrix multiplication and semiring abstraction could offer a path for efficient parallelization of many applications in computational biology other than overlap detection (Jain et al, 2019). Figure 8 shows the strong scaling curves of BELLA for the representative P. aeruginosa 30X data set to measure its parallel performance.…”
Section: S9 Experimental Setting (mentioning)
confidence: 99%
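
As a rough illustration of how a product of A and A^T can drive overlap detection, the sketch below assumes that A is a 0/1 reads-by-k-mers occurrence matrix; the quoted passage does not define A, so this reading is an assumption. Under it, entry (i, j) of A·A^T counts the k-mers shared by reads i and j. The code stores only the nonzero entries in dictionaries rather than using a sparse-matrix library, uses the ordinary (+, ×) semiring, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def kmers(seq, k):
    """Set of distinct k-mers in a read."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_counts(reads, k):
    """Nonzero upper-triangle entries of A * A^T, keyed by (i, j) with i < j.

    Assumption: A[i][m] = 1 if read i contains k-mer m, else 0.
    """
    # Column view of A (equivalently the rows of A^T): k-mer -> read ids.
    columns = defaultdict(set)
    for read_id, seq in enumerate(reads):
        for kmer in kmers(seq, k):
            columns[kmer].add(read_id)

    # Each k-mer column contributes 1 to every pair of reads that contain it.
    counts = defaultdict(int)
    for read_ids in columns.values():
        for i, j in combinations(sorted(read_ids), 2):
            counts[(i, j)] += 1
    return dict(counts)


toy_reads = ["ACGTAC", "GTACGG", "TTTTTT"]
toy_counts = shared_kmer_counts(toy_reads, k=3)
```

Read pairs with a nonzero count are candidate overlaps that would then go on to pairwise alignment, the step the statement reports as dominating runtime.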
“…Graph representations more accurately reflect the sampled individuals within a population, and their use in genome mapping algorithms reduces reference bias and increases mapping accuracy when sequencing a new individual (Ballouz et al, 2019). There is abundant research on data structures designed for graph representations of genomes and pan-genomes (Garrison et al, 2018; Li et al, 2020), their space-efficient indexing (Chang et al, 2020; Ghaffaari and Marschall, 2019; Holley et al, 2016; Jain et al, 2019b; Kuhnle et al, 2020; Marcus et al, 2014; Sirén et al, 2014) and alignment algorithms (Darby et al, 2020; Ivanov et al, 2020; Jain et al, 2020; Kuosmanen et al, 2018; Rautiainen and Marschall, 2020) to map sequences to reference graphs. For review papers summarizing these developments, see Computational Pan-Genomics Consortium (2018), Eizenga et al (2020), and Paten et al (2017).…”
Section: Introduction (mentioning)
confidence: 99%