2012
DOI: 10.1016/j.jpdc.2011.08.001

A high performance multiple sequence alignment system for pyrosequencing reads from multiple reference genomes

Abstract: Genome resequencing with short reads generated from pyrosequencing generally relies on mapping the short reads against a single reference genome. However, mapping of reads from multiple reference genomes is not possible using a pairwise mapping algorithm. In order to align the reads with respect to each other and the reference genomes, existing multiple sequence alignment (MSA) methods cannot be used because they do not take into account the position of these short reads with respect to the genome, and are highly ineffic…

Cited by 11 publications (9 citation statements)
References 48 publications
“…Then, we monitor several performance metrics, such as the memory consumption, the amount of network communications, and the computational overhead of the alignment algorithms. Previous work showed that providing high alignment performance is critical for an alignment algorithm to be adopted by the bio-medical community [41].…”
Section: A Experimental Settings
Citation type: mentioning
confidence: 99%
“…This gives rise to the field of proteogenomics. The most effective and high-throughput tools for studying genomics and proteomics are next generation sequencing machines (NGS) [18] and mass spectrometers (MS) [19], respectively. Proteogenomics requires integration and analysis of data from both of these high-throughput technologies.…”
Section: Background Information
Citation type: mentioning
confidence: 99%
“…These machines produce short fragments of DNA or RNA sequences called reads. The sheer volume of data from these machines (3 billion DNA/RNA reads and 0.6TB per run [21]) needs efficient and high-performance computational tools [22] [18]. In order to process the genomic data it is usually mapped to the reference genome.…”
Section: A Big NGS Data and Computational Challenges
Citation type: mentioning
confidence: 99%
“…3) Load Balancing: Load balancing is one of the most important attributes necessary for performance of a parallel algorithm [25], [26]. Load balancing is important because it ensures that the processors/cores are busy for most of the time the program is running.…”
Section: Parallelizing the Main Loop In Algorithm
Citation type: mentioning
confidence: 99%
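
The load-balancing point in the statement above can be illustrated with a small sketch. This is not the method of the cited paper or of the citing publication; it is a generic greedy heuristic (longest task first, always handed to the least-loaded worker), and the names `balance_load`, `task_costs`, and `num_workers` are hypothetical, chosen only for illustration. It assumes each task's cost (for example, the number of reads in a batch) is known up front.

```python
import heapq


def balance_load(task_costs, num_workers):
    """Assign tasks to workers with a greedy longest-task-first heuristic.

    Illustrative assumption: every task cost is known before scheduling.
    Returns (assignment, loads), where assignment[w] lists the task indices
    given to worker w and loads[w] is that worker's total cost.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")

    # Min-heap of (current_load, worker_id): the least-loaded worker pops first.
    heap = [(0, worker) for worker in range(num_workers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_workers)]

    # Place the most expensive tasks first so the small ones can fill the gaps.
    by_cost_desc = sorted(enumerate(task_costs), key=lambda item: -item[1])
    for task_id, cost in by_cost_desc:
        load, worker = heapq.heappop(heap)
        assignment[worker].append(task_id)
        heapq.heappush(heap, (load + cost, worker))

    loads = [sum(task_costs[t] for t in tasks) for tasks in assignment]
    return assignment, loads


if __name__ == "__main__":
    assignment, loads = balance_load([8, 7, 6, 5, 4], num_workers=2)
    mean_load = sum(loads) / len(loads)
    # A ratio near 1.0 means no worker sits idle for long while others finish.
    print("assignment:", assignment)
    print("imbalance (max load / mean load):", max(loads) / mean_load)
```

The heuristic does not guarantee an optimal split, but the ratio of maximum to mean load gives a quick measure of how long the busiest worker keeps the others waiting.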