Graph algorithms for DNA sequencing – origins, current models and the future

Błażewicz, Jacek; Kasprzak, Marta; Kierzynka, Michał; Frohmberg, Wojciech; Świercz, Aleksandra; Wojciechowski, Pawel; Zurkowski, Piotr

doi:10.1016/j.ejor.2016.06.043

Cited by 18 publications

(5 citation statements)

References 52 publications

“…Finally, they highlight the key developments in sequencing and provide predictions about how these may affect computational models in the future. [20]. Graph algorithms for DNA sequencing-origins, current models and the future • Graph algorithms for DNA sequencing…”

Section: Literature Review (mentioning)

confidence: 99%

Computational Modelling: A Study of Different DNA Sequencing Using DNA Graphs

Gomathy¹,

T²

2023

IJSREM

Natural genetic material may help identify genetic abnormalities and provide insight into the workings of gene expression systems. Disorders associated with chromosomal abnormalities include single nucleotide polymorphisms (SNPs), minor insertions and deletions, and significant chromosomal aberrations. In order to analyse DNA sequences, one of the most important components of biological study, various techniques have been used. Thus, DNA analysis and computing have benefited greatly from a variety of mathematical and algorithmic advances. Sequencing systems are constructed on a quantitative framework, and their workings include cost minimization, deployment, and sensitivity analysis for various parameters. In order to analyse various diseases using DNA, this study will look into the role of DNA sequencing and how it is represented in graphs.

Keywords: DNA graphs; DNA sequencing; DNA library; genetic diseases; graph theory

“…The main features that distinguish NGS from Sangers sequencing are highly parallel, micro scale, fast, shorter lengths and low-cost [10]. Their technique substantially reduces the cost of producing short DNA reads from 50 to 700 bp, and has opened up the possibility for an affordable sequencing of whole genomes [11].…”

Section: Introduction (mentioning)

confidence: 99%

Interval graphs for Genome-Fragment combinations

KAO,

2023

International Conference on Mathematical and Statistical Physics, Computational Science, Education and Communication (ICMSCE 20

“…The overlap graph model is a straightforward conceptualization of the real-world process and works well as long as the sequencing data are not too large. The necessity of representing whole sequences in the graph and calculating inexact sequence alignments for most of pairs of the sequences makes the computational process very time and memory consuming (Blazewicz et al., 2018). The literature reports cases where assemblers from this group did not finish computations for greater datasets because of excessive memory requirements (Gonnella and Kurtz, 2012; Kajitani et al., 2014).…”

Section: Introduction (mentioning)

confidence: 99%
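
The pairwise cost described in the statement above can be seen in a minimal Python sketch of an overlap graph. This is an illustration only, not the method of any cited assembler: it uses exact suffix-prefix matches instead of the inexact alignments mentioned in the quote, and the function names, the `min_len` threshold, and the toy reads are all assumptions made for the example.

```python
from itertools import permutations


def suffix_prefix_overlap(a: str, b: str, min_len: int) -> int:
    """Return the length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` characters exists.
    Exact matching is a simplification; real assemblers allow mismatches.
    """
    min_len = max(min_len, 1)  # an empty overlap is not meaningful
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-length:] == b[:length]:
            return length
    return 0


def build_overlap_graph(reads, min_len=3):
    """Map each ordered read pair (a, b) with a sufficient overlap to its length.

    Every ordered pair of whole reads is compared, which is the quadratic
    step that makes this model expensive on large data sets.
    """
    graph = {}
    for a, b in permutations(reads, 2):
        overlap = suffix_prefix_overlap(a, b, min_len)
        if overlap:
            graph[(a, b)] = overlap
    return graph


# Hypothetical toy reads, chosen only to show the structure of the graph.
reads = ["ACGTAC", "TACGGA", "GGATTT"]
overlap_graph = build_overlap_graph(reads)
```

Here whole reads are the vertices, so the graph keeps full read information, but both the number of comparisons and the stored sequence data grow quickly with the number of reads.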

“…A gain in efficiency of computations is achieved by a much lower volume of stored information and a smaller traversed graph, but mainly by discarding inexact matches. On the other hand, quality of resulting contigs diminishes a bit due to the sequence decomposition to k-mers, as the information about whole reads is partially lost (Blazewicz et al., 2018).…”

Section: Introduction (mentioning)

confidence: 99%
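
The k-mer decomposition mentioned in the statement above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation of any cited tool: the function name, the choice of `k=4`, and the toy reads are hypothetical, and error correction and reverse complements are ignored.

```python
from collections import defaultdict


def build_de_bruijn_graph(reads, k):
    """Build a de Bruijn graph whose vertices are (k-1)-mers.

    Each k-mer found in a read contributes one edge from its (k-1)-length
    prefix to its (k-1)-length suffix. Only exact k-mer matches connect
    reads, and the graph no longer records which read a k-mer came from.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# Hypothetical toy reads; k=4 is an arbitrary choice for the example.
reads = ["ACGTAC", "TACGGA"]
de_bruijn = build_de_bruijn_graph(reads, k=4)
```

In this toy graph the vertex `ACG` has two successors, `CGT` and `CGG`, one from each read. Once the reads are decomposed, the graph alone cannot say which branch a given read followed, which is a small instance of the partial loss of whole-read information the quote describes.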

Genome-scale de novo assembly using ALGA

et al. 2021

Self Cite

Motivation: There are very few methods for de novo genome assembly based on the overlap graph approach. This approach is considered to give more exact results than the so-called de Bruijn graph approach, but at the cost of much greater running time and memory usage. It is not uncommon for assembly methods based on the overlap graph model to fail on larger data sets, mainly because of computer memory limitations. For this reason, mainly de Bruijn-based assembly methods, which are fast and fairly accurate, have been developed in recent decades. However, these methods can fail for longer or more repetitive genomes, as they decompose reads into shorter fragments and lose part of the information. An efficient assembler that processes big data sets using the overlap graph model is still being sought.

Results: We propose a new genome-scale de novo assembler based on the overlap graph approach, designed for short-read sequencing data. The method, ALGA, incorporates several new ideas that result in more exact contigs produced in a short time. These ideas include the creation of a sparse but quite informative graph, a reduction of the graph that includes a procedure related to the minimum spanning tree problem on a local subgraph, and a graph traversal combined with simultaneous analysis of the contigs stored so far. Unusually for genome assembly, the algorithm is almost parameter-free, with only one optional parameter to be set by the user. ALGA was compared with nine state-of-the-art assemblers in tests on genome-scale sequencing data obtained from real experiments on six organisms differing in size, coverage, GC content, and repetition rate. ALGA produced the best results in terms of overall quality of genome reconstruction, understood as a good balance between genome coverage, accuracy, and length of the resulting sequences. The algorithm is one of the tools used to process data in the ongoing national project Genomic Map of Poland.

Availability: ALGA is available at http://alga.put.poznan.pl.

Supplementary information: Supplementary material is available at Bioinformatics online.
