State-of-the-Art in Smith–Waterman Protein Database Search on HPC Platforms

Rucci, Enzo; García, Carlos; Botella, Guillermo; De Giusti, Armando; Naiouf, Marcelo; Prieto-Matías, Manuel

doi:10.1007/978-3-319-41279-5_6

Cited by 4 publications

(5 citation statements)

References 34 publications

“…These works reduce SW execution time through the exploitation of High-Performance Computing (HPC) architectures. However, most implementations focus on short sequences, particularly protein sequences [17]. For very long sequences, as the DNA case, few implementations are available.…”

Section: Introduction (mentioning)

confidence: 99%

Accelerating Smith-Waterman Alignment of Long DNA Sequences with OpenCL on FPGA

Rucci, García, Botella, et al. 2017

Bioinformatics and Biomedical Engineering

Self Cite

With the greater importance of parallel architectures such as GPUs or Xeon Phi accelerators, the scientific community has developed efficient solutions in the bioinformatics field. In this context, FPGAs begin to stand out as high performance devices with moderate power consumption. This paper presents and evaluates a parallel strategy of the well-known Smith-Waterman algorithm using OpenCL on Intel/Altera's FPGA for long DNA sequences. We efficiently exploit data and pipeline parallelism on an Intel/Altera Stratix V FPGA, reaching up to 114 GCUPS with less than 25 watts of power.

Section: Introduction (mentioning)

confidence: 99%

Accelerating Smith-Waterman Alignment of Long DNA Sequences with OpenCL on FPGA

Rucci, García, Botella, et al. 2017

Bioinformatics and Biomedical Engineering

Self Cite

“…The advantage of choosing a GPU lies in two aspects: the performance increment of successive GPU generations and their affordable prices. However, it is important to mention that newer GPU generations do not always provide better performance in the context of sequence alignments using the SW method, such as with CUDASW++ software [5]. Likewise, it has also been observed that CUDAlign does not always provide the best performance rates for small and medium sequence sizes.…”

Section: Results (mentioning)

confidence: 99%

“…The parallelization of SW has been developed in two different alignment contexts: (i) a protein sequence against a genomic database; and (ii) two long DNA sequences. The first scenario involves the construction of a matrix of moderate size which allows the alignment of several independent sequences simultaneously [5]. However, in the context of DNA sequence, this scheme is impracticable due to limited memory resources.…”

Section: Introduction (mentioning)

confidence: 99%

SWIFOLD: Smith-Waterman implementation on FPGA with OpenCL for long DNA sequences

et al. 2018

Self Cite

Background: The Smith-Waterman (SW) algorithm is the best choice for searching similar regions between two DNA or protein sequences. However, it may become impracticable in some contexts due to its high computational demands. Consequently, the computer science community has focused on the use of modern parallel architectures such as Graphics Processing Units (GPUs), Xeon Phi accelerators and Field Programmable Gate Arrays (FPGAs) to speed up large-scale workloads. Results: This paper presents and evaluates SWIFOLD: a Smith-Waterman parallel Implementation on FPGA with OpenCL for Long DNA sequences. First, we evaluate its performance and resource usage for different kernel configurations. Next, we carry out a performance comparison between our tool and other state-of-the-art implementations considering three different datasets. SWIFOLD offers the best average performance for small and medium test sets, achieving a performance that is independent of input size and sequence similarity. In addition, SWIFOLD provides competitive performance rates in comparison with GPU-based implementations on the latest GPU generation for the large dataset. Conclusions: The results suggest that SWIFOLD can be a serious contender for accelerating the SW alignment of DNA sequences of unrestricted size in an affordable way reaching on average 125 GCUPS and almost a peak of 270 GCUPS.
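The memory limitation mentioned in the citation statement above (a full alignment matrix is impracticable for long DNA sequences) can be illustrated with a minimal sketch that computes only the best local alignment score while keeping a single previous row. This is not the SWIFOLD implementation: the function name `sw_score`, the linear gap penalty, and the scoring values are assumptions chosen for illustration.

```python
def sw_score(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score of a and b.

    Assumptions (illustrative only): linear gap penalty and fixed
    match/mismatch scores. Only two rows of the matrix are kept, so
    memory grows with len(b) rather than len(a) * len(b).
    """
    prev = [0] * (len(b) + 1)  # row i-1 of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)  # row i; column 0 stays 0
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(sw_score("TTACGTTT", "GGACGGG"))
```

Because each cell depends only on its left, upper, and upper-left neighbours, the score can be obtained without storing the whole matrix; recovering the alignment itself would need extra bookkeeping that this sketch omits.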

“…UniProtKB/Swiss-Prot (release 2016_11). This database contains 197953409 amino acid residues in 553231 sequences with a maximum length of 35213.…”

Section: Experimental Design (mentioning)

confidence: 99%

“…In the last few years, the feasibility of using parallel computational devices to improve performance has received considerable attention in bioinformatics. In the context of SW protein alignment, the exploitation of SIMD (Single Instruction Multiple Data) capabilities on modern CPUs has been widely studied [6]. Among the proposals, we can highlight the fastest SSE-based tool SWIPE [7] and its evolution into AVX2 extensions libssa [8].…”

Section: Introduction (mentioning)

confidence: 99%

SWIMM 2.0: Enhanced Smith–Waterman on Intel’s Multicore and Manycore Architectures Based on AVX-512 Vector Extensions

Rucci, Sánchez, Juan, et al. 2018

Int J Parallel Prog

Self Cite

The well-known Smith-Waterman (SW) algorithm is the most commonly used method for local sequence alignments, but its acceptance is limited by the computational requirements for large protein databases. Although the acceleration of SW has already been studied on many parallel platforms, there are hardly any studies which take advantage of the latest Intel architectures based on AVX-512 vector extensions. This SIMD set is currently supported by Intel's Knights Landing (KNL) accelerator and Intel's Skylake (SKL) general purpose processors. In this paper, we present an SW version that is optimized for both architectures: the renowned SWIMM 2.0. The novelty of this vector instruction set requires the revision of previous programming and optimization techniques. SWIMM 2.0 is based on a massive multi-threading and SIMD exploitation. It is competitive in terms of performance compared with other state-of-the-art implementations, reaching 511 GCUPS on a single KNL node and 734 GCUPS on a server equipped with a dual SKL processor. Moreover, these successful performance rates make SWIMM 2.0 the most efficient energy footprint implementation in this study achieving 2.94 GCUPS/Watts on the SKL processor.
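The abstracts above report performance in GCUPS (billions of cell updates per second). As a rough sketch of how that figure is usually derived for a database search, the number of matrix cells is the query length times the total number of database residues, divided by the run time. The helper name `gcups`, the query length, and the timing below are hypothetical; only the Swiss-Prot residue count comes from the citation statement quoted earlier.

```python
def gcups(query_len: int, db_residues: int, seconds: float) -> float:
    """Billions of cell updates per second for one database search.

    cells = query_len * db_residues (one matrix cell per residue pair).
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return query_len * db_residues / (seconds * 1e9)


if __name__ == "__main__":
    swiss_prot_residues = 197_953_409  # residue count quoted above
    # Hypothetical query length and timing, for illustration only.
    print(f"{gcups(1000, swiss_prot_residues, 2.0):.2f} GCUPS")
```

With these assumed inputs, a 1000-residue query that finishes in 2 seconds corresponds to roughly 99 GCUPS.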
