Experiences with String Matching on the Fermi Architecture

Tumeo, Antonino; Secchi, Simone; Villa, Oreste

doi:10.1007/978-3-642-19137-4_3

Cited by 16 publications

(19 citation statements)

References 14 publications


“…Our scheme can sustain a 6.49 Gbit/s throughput for small packets, and a 29.7 Gbit/s for full-payload packets. Comparing with the Tesla C2050 throughput, reported in [16], our implementation is about three times faster.…”

Section: Related Work (mentioning)

confidence: 75%

“…Due to their high entropy, random workloads can be assumed quite representative for evaluating pattern matching [16]; specific scenarios, like virus detection or traffic classification may perform better, due to the lower entropy between the scanned content and the patterns themself.…”

Section: Performance Evaluation (mentioning)

confidence: 99%

“…In order to speed-up the pattern matching computation on the GPU, Smith et al [14] and Tumeo et al [16,17], redesigned the packet reading process, such that each thread is fetching four bytes at a time, instead of one. Since the input symbols belong to the ASCII alphabet, they are represented with 8 bits.…”

Section: Related Work (mentioning)

confidence: 99%


Parallelization and characterization of pattern matching using GPUs

Vasiliadis

Polychronakis

Ioannidis

2011

2011 IEEE International Symposium on Workload Characterization (IISWC)


“…Some researchers used graphics processing units (GPUs) to accelerate NIDSs. The underutilized computational power of GPUs has been used to accelerate NIDSs such as Snort by offloading pattern-matching process into GPUs [19][20][21][22][23]. In [22], GPU is used to accelerate pattern matching in NIDS using an efficient pattern matching algorithm based on hierarchical hash table architecture.…”

Section: Related Work (mentioning)

confidence: 99%

Accelerating snort NIDS using NetFPGA-based Bloom filter

Al-Dalky

Salah

Otrok

et al. 2014

2014 International Wireless Communications and Mobile Computing Conference (IWCMC)


In recent years, network intrusion detection systems (NIDS) have faced a serious throughput challenge as a result of the rapid increase of network links to 1 and 10 Gbps rates. Consequently, this calls for NIDS to have wire-speed packet processing and real-time detection of malicious traffic. Snort is the most popular NIDS. Snort is an open source software-based NIDS and runs as a single threaded application. Snort processing and detection capabilities can be limited in networks with 1 and 10 Gbps network links. To overcome such a limitation, we present a design and implementation of two layer NIDS for accelerating Snort detection. The design combines hardware and software components whereby Snort operates as the second line of defense after hardware-assisted inspection of packet headers. In our design, Snort's frequently used rules are offloaded from Snort to a NetFPGA-based hardware layer. The NetFPGA implementation is based on Bloom filter to analyze and filter incoming packets with header fields matching those of frequently used rules. The second line of defense will dynamically offload the most frequently triggered rules to the NetFPGA and will only be executed if deep packet analysis is required for the incoming packet. The experimental results show a significant improvement in the CPU usage and an enormous reduction in packet loss when using Snort with NetFPGA filtering.


“…GPU implementations of the Aho-Corasick algorithm have been proposed earlier [17], [30], [39] and a GPU implementation of the Boyer-Moore multistring matching algorithm is described in [39]. Lin et al [17] and Tumeo et al [30] consider the host-to-host case in which the target string begins in the host CPU and the pattern matches are to be brought back from the GPU to the host CPU.…”

Section: Introduction (mentioning)

confidence: 98%

GPU-to-GPU and Host-to-Host Multipattern String Matching on a GPU

Zha

Sahni

2013

IEEE Trans. Comput.


We develop GPU adaptations of the Aho-Corasick and multipattern Boyer-Moore string matching algorithms for the two cases GPU-to-GPU (input to the algorithms is initially in GPU memory and the output is left in GPU memory) and host-to-host (input and output are in the memory of the host CPU). For the GPU-to-GPU case, we consider several refinements to a base GPU implementation and measure the performance gain from each refinement. For the host-to-host case, we analyze two strategies to communicate between the host and the GPU and show that one is optimal with respect to run time while the other requires less device memory. This analysis is done for GPUs with one I/O channel to the host as well as those with 2. Experiments conducted on an NVIDIA Tesla GT200 GPU that has 240 cores running off of a Xeon 2.8GHz quad-core host CPU show that, for the GPU-to-GPU case, our Aho-Corasick GPU adaptation achieves a speedup between 8.5 and 9.5 relative to a single-thread CPU implementation and between 2.4 and 3.2 relative to the best multithreaded implementation. For the host-to-host case, the GPU AC code achieves a speedup of 3.1 relative to a single-threaded CPU implementation. However, the GPU is unable to deliver any speedup relative to the best multithreaded code running on the quad-core host. In fact, the measured speedups for the latter case ranged between 0.74 and 0.83. Early versions of our multipattern Boyer-Moore adaptations ran 7% to 10% slower than corresponding versions of the AC adaptations and we did not refine the multipattern Boyer-Moore codes further.

