GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping

Alser, Mohammed; Hassan, Hasan; Xin, Hongyi; Ergin, Oğuz; Mutlu, Onur; Alkan, Can

doi:10.1093/bioinformatics/btx342

Cited by 98 publications

(103 citation statements)

References 46 publications

Supporting

Mentioning

103

Contrasting


“…The way we build our neighborhood map ensures that computing each of its entries is independent of every other, and thus the entire map can be computed all at once in a parallel fashion. Hence, our neighborhood map is well suited for highly parallel computing platforms (Alser et al, 2017a; Seshadri et al, 2017). Note that in sequence alignment algorithms, computing each entry of the dynamic programming matrix depends on the values of the immediate left, upper left and upper entries of its own.…”

Section: Building the Neighborhood Map (mentioning)

confidence: 99%
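
The data dependency described in the citation statement above can be illustrated with the textbook edit-distance recurrence. This is a minimal sketch, not code from GateKeeper or Shouji; the function name `edit_distance` and the unit costs are assumptions chosen for illustration. Each cell reads its upper, left, and upper-left neighbours, so the cells cannot all be computed at once.

```python
def edit_distance(a: str, b: str) -> int:
    """Textbook Levenshtein distance (unit costs assumed for illustration)."""
    rows, cols = len(a) + 1, len(b) + 1
    dp = [[0] * cols for _ in range(rows)]

    # Boundary cells: turning a prefix into the empty string costs its length.
    for i in range(rows):
        dp[i][0] = i
    for j in range(cols):
        dp[0][j] = j

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # upper entry (deletion)
                dp[i][j - 1] + 1,         # left entry (insertion)
                dp[i - 1][j - 1] + cost,  # upper-left entry (match or substitution)
            )
    return dp[-1][-1]


print(edit_distance("ACGT", "AGT"))
```

Because every cell waits on three earlier cells, only the cells on one anti-diagonal can be computed at the same time. A filter whose entries are mutually independent does not have this restriction.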

“…GRIM-Filter (Kim et al, 2018) exploits the high memory bandwidth and the logic layer of 3D-stacked memory to perform highly-parallel filtering in the DRAM chip itself. GateKeeper (Alser et al, 2017a) is designed to utilize the large amounts of parallelism offered by FPGA architectures. MAGNET (Alser et al, 2017b) shows a low number of falsely accepted sequence pairs but its current implementation is much slower than that of SHD or GateKeeper.…”

Section: Introduction (mentioning)

confidence: 99%

“…MAGNET (Alser et al, 2017b) shows a low number of falsely accepted sequence pairs but its current implementation is much slower than that of SHD or GateKeeper. GateKeeper (Alser et al, 2017a) provides a high filtering speed but suffers from relatively high number of falsely accepted sequence pairs.…”

Section: Introduction (mentioning)

confidence: 99%


Shouji: a fast and efficient pre-alignment filter for sequence alignment

et al. 2019

Self Cite


Motivation: The ability to generate massive amounts of sequencing data continues to overwhelm the processing capability of existing algorithms and compute infrastructures. In this work, we explore the use of hardware/software co-design and hardware acceleration to significantly reduce the execution time of short sequence alignment, a crucial step in analyzing sequenced genomes. We introduce Shouji, a highly parallel and accurate pre-alignment filter that remarkably reduces the need for computationally-costly dynamic programming algorithms. The first key idea of our proposed pre-alignment filter is to provide high filtering accuracy by correctly detecting all common subsequences shared between two given sequences. The second key idea is to design a hardware accelerator that adopts modern field-programmable gate array (FPGA) architectures to further boost the performance of our algorithm.

Results: Shouji significantly improves the accuracy of pre-alignment filtering by up to two orders of magnitude compared to the state-of-the-art pre-alignment filters, GateKeeper and SHD. Our FPGA-based accelerator is up to three orders of magnitude faster than the equivalent CPU implementation of Shouji. Using a single FPGA chip, we benchmark the benefits of integrating Shouji with five state-of-the-art sequence aligners, designed for different computing platforms. The addition of Shouji as a pre-alignment step reduces the execution time of the five state-of-the-art sequence aligners by up to 18.8×. Shouji can be adapted for any bioinformatics pipeline that performs sequence alignment for verification. Unlike most existing methods that aim to accelerate sequence alignment, Shouji does not sacrifice any of the aligner capabilities, as it does not modify or replace the alignment step.

Availability and implementation: https://github.com/CMU-SAFARI/Shouji

Supplementary information: Supplementary data are available at Bioinformatics online.


“…Accelerating Pre-Alignment. A very recent prior work [11] implements a seed location filter in an FPGA, and shows significant speedup against prior filters. However, as shown in that work, the FPGA is still limited by the memory bandwidth bottleneck.…”

Section: Related Work (mentioning)

confidence: 99%

“…With the advent of seed location filters, the performance bottleneck of DNA read mapping has shifted from sequence alignment to seed location filtering [10,11,98,100]. Unfortunately, a seed location filter requires large amounts of memory bandwidth to process and characterize each of the candidate locations.…”

Section: Introduction (mentioning)

confidence: 99%

GRIM-Filter: Fast seed location filtering in DNA read mapping using processing-in-memory technologies

et al. 2018

Self Cite


Background: Seed location filtering is critical in DNA read mapping, a process where billions of DNA fragments (reads) sampled from a donor are mapped onto a reference genome to identify genomic variants of the donor. State-of-the-art read mappers 1) quickly generate possible mapping locations for seeds (i.e., smaller segments) within each read, 2) extract reference sequences at each of the mapping locations, and 3) check similarity between each read and its associated reference sequences with a computationally-expensive algorithm (i.e., sequence alignment) to determine the origin of the read. A seed location filter comes into play before alignment, discarding seed locations that alignment would deem a poor match. The ideal seed location filter would discard all poor match locations prior to alignment such that there is no wasted computation on unnecessary alignments.

Results: We propose a novel seed location filtering algorithm, GRIM-Filter, optimized to exploit 3D-stacked memory systems that integrate computation within a logic layer stacked under memory layers, to perform processing-in-memory (PIM). GRIM-Filter quickly filters seed locations by 1) introducing a new representation of coarse-grained segments of the reference genome, and 2) using massively-parallel in-memory operations to identify read presence within each coarse-grained segment. Our evaluations show that for a sequence alignment error tolerance of 0.05, GRIM-Filter 1) reduces the false negative rate of filtering by 5.59x–6.41x, and 2) provides an end-to-end read mapper speedup of 1.81x–3.65x, compared to a state-of-the-art read mapper employing the best previous seed location filtering algorithm.

Conclusion: GRIM-Filter exploits 3D-stacked memory, which enables the efficient use of processing-in-memory, to overcome the memory bandwidth bottleneck in seed location filtering. We show that GRIM-Filter significantly improves the performance of a state-of-the-art read mapper. GRIM-Filter is a universal seed location filter that can be applied to any read mapper. We hope that our results provide inspiration for new works to design other bioinformatics algorithms that take advantage of emerging technologies and new processing paradigms, such as processing-in-memory using 3D-stacked memory devices.


In‐Memory Computing with Memristor Content Addressable Memories for Pattern Matching

Graves, Sheng, et al. 2020

Advanced Materials


The dramatic rise of data‐intensive workloads has revived application‐specific computational hardware for continuing speed and power improvements, frequently achieved by limiting data movement and implementing “in‐memory computation”. However, conventional complementary metal oxide semiconductor (CMOS) circuit designs can still suffer low power efficiency, motivating designs leveraging nonvolatile resistive random access memory (ReRAM), and with many studies focusing on crossbar circuit architectures. Another circuit primitive—content addressable memory (CAM)—shows great promise for mapping a diverse range of computational models for in‐memory computation, with recent ReRAM–CAM designs proposed but few experimentally demonstrated. Here, programming and control of memristors across an 86 × 12 memristor ternary CAM (TCAM) array integrated with CMOS are demonstrated, and parameter tradeoffs for optimizing speed and search margin are evaluated. In addition to smaller area, this memristor TCAM results in significantly lower power due to very low programmable conductance states, motivating CAM use in a wider range of computational applications than conventional TCAMs are confined to today. Finally, the first experimental demonstration of two computational models in memristor TCAM arrays is reported: regular expression matching in a finite state machine for network security intrusion detection and definable inexact pattern matching in a Levenshtein automata for genomic sequencing.
