Proceedings of the 12th International Workshop on Data Management on New Hardware 2016
DOI: 10.1145/2933349.2933357

SIMD-accelerated regular expression matching

Abstract: String processing tasks are common in analytical queries powering business intelligence. Besides substring matching, provided in SQL by the like operator, popular DBMSs also support regular expressions as selective filters. Substring matching can be optimized by using specialized SIMD instructions on mainstream CPUs, reaching the performance of numeric column scans. However, generic regular expressions are harder to evaluate, being dependent on both the DFA size and the irregularity of the input. Here, we opti…

Cited by 15 publications (7 citation statements)
References 15 publications (25 reference statements)
“…The algorithm is applied on the case where the input is matched against a single regular expression with a few hundreds of states and does not scale for the case of multiple pattern matching where we need to access thousands of states for every byte of input. Sitaridi et al [13] use the same hardware gathers as we do, but apply them on database applications where the multiple, independent strings need to be matched against a single regular expression. There have been approaches that use other SIMD instructions for multiple exact pattern matching, but have constraints that make them impractical for the case of Network Intrusion Detection.…”
Section: SIMD Approaches to Pattern Matching (mentioning)
confidence: 99%
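
The setting described in this statement, many independent strings matched against one regular expression through table lookups, can be sketched as follows. This is only a scalar illustration under stated assumptions, not the method of the cited paper: the DFA (for the hypothetical anchored pattern `ab*c`), the flat table layout, and the names `build_table` and `match_lockstep` are all invented for the example, and a list comprehension stands in for what a hardware gather instruction would do in one step.

```python
from typing import List, Sequence

# Hypothetical DFA for the anchored pattern "ab*c" over byte input.
# State 0: start, 1: saw 'a' followed by zero or more 'b',
# state 2: accept, 3: dead (no match possible any more).
NUM_SYMBOLS = 256
NUM_STATES = 4
DEAD = 3
ACCEPTING = {2}


def build_table() -> List[int]:
    """Build a flat transition table: table[state * 256 + byte] -> next state."""
    table = [DEAD] * (NUM_STATES * NUM_SYMBOLS)
    table[0 * NUM_SYMBOLS + ord("a")] = 1
    table[1 * NUM_SYMBOLS + ord("b")] = 1
    table[1 * NUM_SYMBOLS + ord("c")] = 2
    return table


def match_lockstep(table: Sequence[int], strings: Sequence[bytes]) -> List[bool]:
    """Advance one DFA state per string ("lane"), one input byte per step."""
    states = [0] * len(strings)
    longest = max((len(s) for s in strings), default=0)
    for pos in range(longest):
        # Lanes whose string still has input at this position.
        active = [lane for lane, s in enumerate(strings) if pos < len(s)]
        # One table index per active lane: current state and current byte.
        indices = [states[lane] * NUM_SYMBOLS + strings[lane][pos] for lane in active]
        # Stand-in for a single gather: fetch all next states at once.
        gathered = [table[i] for i in indices]
        for lane, next_state in zip(active, gathered):
            states[lane] = next_state
    return [state in ACCEPTING for state in states]


if __name__ == "__main__":
    inputs = [b"ac", b"abbbc", b"abx", b"", b"abcc"]
    print(match_lockstep(build_table(), inputs))
```

The loop also hints at why input irregularity matters: strings of different lengths leave some lanes idle in later steps, so fewer lookups per step do useful work.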
“…section 3) allows even applications with irregular data patterns to gain performance from data parallelism. For example, SIMD can speed up regular expression matching [12,13,14]. Here, the input is matched against a single regular expression at a time, represented by a finite state machine that can fit in L1 or L2 cache.…”
Section: Introduction (mentioning)
confidence: 99%
“…Let us now turn our attention to data-parallel execution using SIMD operations. There has been extensive research investigating SIMD for database operations [51,50,36,37,38,35,46,44]. It is not surprising that this research generally assumes a vectorized execution model.…”
Section: Data-parallel Execution (SIMD) (mentioning)
confidence: 99%
“…Integrating SIMD processing with database systems has been studied for more than a decade [28]. Several operations, such as selection [12,23], join [2,3,10,26], partitioning [20], sorting [5], CSV parsing [17], regular expression matching [25], and (de-)compression [15,23,27] have been accelerated using the SIMD capabilities of the x86 architectures. In more recent iterations of hardware evolution, SIMD instruction sets have become even more popular in the field of database systems.…”
Section: Introduction (mentioning)
confidence: 99%