Big Data Analytics in Genomics 2016
DOI: 10.1007/978-3-319-41279-5_6

State-of-the-Art in Smith–Waterman Protein Database Search on HPC Platforms

Abstract: Searching biological sequence databases is a common and repeated task in bioinformatics and molecular biology. The Smith–Waterman algorithm is the most accurate method for this kind of search. Unfortunately, this algorithm is computationally demanding, and the situation has worsened with the exponential growth of biological data in recent years. For that reason, the scientific community [...] analyse temporal evolution, contributions, limitations and experimental work and t…

Citations: cited by 4 publications (5 citation statements)
References: 34 publications
“…These works reduce SW execution time through the exploitation of High-Performance Computing (HPC) architectures. However, most implementations focus on short sequences, particularly protein sequences [17]. For very long sequences, as the DNA case, few implementations are available.…”
Section: Introduction (mentioning)
confidence: 99%
“…The advantage of choosing a GPU lies in two aspects: the performance increment of successive GPU generations and their affordable prices. However, it is important to mention that newer GPU generations do not always provide better performance in the context of sequence alignments using the SW method, such as with CUDASW++ software [5]. Likewise, it has also been observed that CUDAlign does not always provide the best performance rates for small and medium sequence sizes.…”
Section: Results (mentioning)
confidence: 99%
“…The parallelization of SW has been developed in two different alignment contexts: (i) a protein sequence against a genomic database; and (ii) two long DNA sequences. The first scenario involves the construction of a matrix of moderate size which allows the alignment of several independent sequences simultaneously [5]. However, in the context of DNA sequence, this scheme is impracticable due to limited memory resources.…”
Section: Introduction (mentioning)
confidence: 99%
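
The protein-against-database scenario described in the statement above can be illustrated with a minimal sketch. It assumes a simple match/mismatch score and a linear gap penalty, which are illustrative choices rather than anything taken from the cited works; real protein search tools typically use a substitution matrix and affine gap penalties. The function names `sw_score` and `search` are hypothetical. Each database sequence is scored independently of the others, which is what makes this scenario easy to parallelize.

```python
def sw_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score.

    Scoring values are illustrative assumptions (linear gap penalty).
    Only two rows of the matrix are kept, so memory is O(len(subject)).
    """
    prev = [0] * (len(subject) + 1)
    best = 0
    for q in query:
        curr = [0]  # first column is always zero
        for j, s in enumerate(subject, start=1):
            diag = prev[j - 1] + (match if q == s else mismatch)
            up = prev[j] + gap        # gap in the subject
            left = curr[j - 1] + gap  # gap in the query
            cell = max(0, diag, up, left)  # local alignment never goes below zero
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def search(query, database):
    """Score the query against every sequence; highest scores first.

    `database` is a hypothetical dict mapping names to sequences.
    Each score is independent, so the loop could be distributed.
    """
    hits = [(sw_score(query, seq), name) for name, seq in database.items()]
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    db = {"a": "ACGT", "b": "TTTT", "c": "ACGA"}
    for score, name in search("ACGT", db):
        print(name, score)
```

Keeping only two matrix rows is enough here because only the best score is reported; recovering the alignment itself would need the full matrix or a traceback scheme.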
“…UniProtKB/Swiss-Prot (release 2016_11). This database contains 197953409 amino acid residues in 553231 sequences with a maximum length of 35213.…”
Section: Experimental Design (mentioning)
confidence: 99%
“…In the last few years, the feasibility of using parallel computational devices to improve performance has received considerable attention in bioinformatics. In the context of SW protein alignment, the exploitation of SIMD (Single Instruction Multiple Data) capabilities on modern CPUs has been widely studied [6]. Among the proposals, we can highlight the fastest SSE-based tool SWIPE [7] and its evolution into AVX2 extensions libssa [8].…”
Section: Introduction (mentioning)
confidence: 99%