2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) 2015
DOI: 10.1109/iccad.2015.7372576

Heterogeneous hardware/software acceleration of the BWA-MEM DNA alignment algorithm

Cited by 45 publications (28 citation statements)
References 14 publications
“…By offloading the computational bottleneck onto Virtex-7 XC7VX690T-2 FPGA, the entire system can deliver a total acceleration of about 45%. This work is later extended by Ahmed et al [91] where a hardware suffix array is used to partially accelerate SMEM generation, which enables a total application acceleration of 2.6× compared to the original software version.…”
Section: Mapping (mentioning)
confidence: 99%
“…To the authors' knowledge, only a few accelerated implementations of BWA-MEM exist: two FPGA implementations of BWA-MEM on the Convey supercomputing platform: one offloading the Seed Extension phase onto four Xilinx Virtex-6 FPGAs [4] obtaining a 1.5x speedup, the other accelerating multiple BWA-MEM phases [1] obtaining a 2.6x speedup; and a GPU-accelerated implementation of the Seed Extension phase [5], achieving a 1.6x speedup. This work improves upon [5], obtaining far better results: a two-fold speedup for a system with up to twenty-two logical cores is obtained, compared to an at most 1.6x speedup for a system with up to four cores.…”
Section: Related Work (mentioning)
confidence: 99%
“…To achieve this, it makes use of the Seed-and-Extend paradigm (refer to Figure 1), a two-step method consisting of an Exact Matching phase and an Inexact Matching phase (for details, see [1]). First, for each short read Seed Generation is performed: exactly matching subsequences of the read and reference called seeds are identified using a Burrows-Wheeler Transform-based index.…”
Section: The BWA-MEM Algorithm (mentioning)
confidence: 99%
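The Seed-and-Extend paradigm described in the statement above can be illustrated with a toy sketch. This is not BWA-MEM's implementation: it swaps the Burrows-Wheeler Transform-based index for a plain k-mer hash table, and it replaces the Inexact Matching phase with a simple ungapped mismatch count. All function names and parameters (`build_kmer_index`, `find_seeds`, `extend_seed`, `align`, `k`, `max_mismatches`) are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer of the reference to its start positions.

    Assumption: a hash table stands in for the BWT-based index.
    """
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def find_seeds(read, index, k):
    """Exact Matching: return (read_pos, ref_pos) pairs of matching k-mers."""
    seeds = []
    for j in range(len(read) - k + 1):
        for i in index.get(read[j:j + k], []):
            seeds.append((j, i))
    return seeds


def extend_seed(read, reference, read_pos, ref_pos, max_mismatches):
    """Inexact Matching (simplified): ungapped extension of one seed.

    Places the whole read on the seed's diagonal and counts mismatches.
    Returns (ref_start, mismatches), or None if the placement is rejected.
    """
    start = ref_pos - read_pos
    if start < 0 or start + len(read) > len(reference):
        return None
    window = reference[start:start + len(read)]
    mismatches = sum(1 for a, b in zip(read, window) if a != b)
    if mismatches > max_mismatches:
        return None
    return (start, mismatches)


def align(read, reference, k=4, max_mismatches=2):
    """Seed, then extend; return unique placements, best (fewest mismatches) first."""
    index = build_kmer_index(reference, k)
    hits = set()
    for read_pos, ref_pos in find_seeds(read, index, k):
        result = extend_seed(read, reference, read_pos, ref_pos, max_mismatches)
        if result is not None:
            hits.add(result)
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


if __name__ == "__main__":
    # The read equals reference[5:13] with its last base changed.
    print(align("CAGATTTG", "GATTACAGATTTCCAGGA"))  # [(5, 1)]
```

Several seeds that lie on the same diagonal collapse into a single placement here, which is why the result is collected in a set before sorting.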
“…To our knowledge the only application-level accelerated integrated implementations of BWA-MEM that exist are: an FPGA-accelerated implementation of the Seed Extension phase [15] achieving a 1.5x speedup, further improved in [16] for an overall 2.6x speedup; and a GPU implementation [9], further improved to achieve an up to 2x speedup [17]. The FPGA implementation used here builds on [15], and a comparison of the implementation here is made to the improved GPU implementation.…”
Section: Related Work (mentioning)
confidence: 99%
“…A significant difference to the design in [15] and [16] is the fact that the Alpha Data card used here contains only a single Virtex-7 FPGA, whereas [15] and [16] use the Convey HC-2 EX as implementation platform, which contains four user-configurable Virtex-6 FPGAs. As the design here is limited by the amount of LUTs available, and the Virtex-7 FPGA on the Alpha Data card contains 432,368 LUTs versus 474,240 LUTs per Virtex-6 FPGA on the Convey, this means only about 23% of the resources are available as compared to the Convey platform.…”
Section: A FPGA Design and Implementation (mentioning)
confidence: 99%
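The "about 23%" figure in the statement above follows from the quoted LUT counts if the single Virtex-7 is compared against all four Virtex-6 FPGAs combined. The short check below uses only the numbers from the quote; the function and parameter names are hypothetical.

```python
def lut_fraction(single_fpga_luts, luts_per_fpga, fpga_count):
    """Fraction of LUTs on one FPGA relative to a multi-FPGA platform's total."""
    return single_fpga_luts / (luts_per_fpga * fpga_count)


if __name__ == "__main__":
    # Figures as quoted: one Virtex-7 (432,368 LUTs) versus
    # four Virtex-6 FPGAs (474,240 LUTs each) on the Convey HC-2 EX.
    fraction = lut_fraction(432_368, 474_240, 4)
    print(f"{fraction:.1%}")  # 22.8%
```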