Yu Fang scite author profile

Fast and memory-efficient regular expression matching for deep packet inspection

Fang¹, Chen², Diao³ et al. 2006


Packet content scanning at high speed has become extremely important due to its applications in network security, network monitoring, HTTP load balancing, etc. In content scanning, the packet payload is compared against a set of patterns specified as regular expressions. In this paper, we first show that memory requirements using traditional methods are prohibitively high for many patterns used in packet scanning applications. We then propose regular expression rewrite techniques that can effectively reduce memory usage. Further, we develop a grouping scheme that can strategically compile a set of regular expressions into several engines, resulting in a remarkable improvement in regular expression matching speed without much increase in memory usage. We implement a new DFA-based packet scanner using the above techniques. Our experimental results using real-world traffic and patterns show that our implementation achieves a factor of 12 to 42 performance improvement over a commonly used DFA-based scanner. Compared to the state-of-the-art NFA-based implementation, our DFA-based packet scanner achieves a 50 to 700 times speedup.
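To make the memory claim concrete, here is a minimal sketch of one well-known source of DFA state blowup. It is not taken from the paper: the pattern family `(a|b)*a(a|b){n}`, the two-symbol alphabet, and the function name `count_dfa_states` are all assumptions chosen for illustration. The NFA has n + 2 states, while the subset construction must remember the last n + 1 input symbols.

```python
def count_dfa_states(n):
    """Count DFA states reachable by subset construction for the
    illustrative pattern (a|b)*a(a|b){n} over the alphabet {a, b}.

    Hypothetical NFA layout (an assumption for this sketch):
      state 0      loops on a and b, and also moves to state 1 on 'a'
      states 1..n  advance to the next state on any symbol
      state n+1    is accepting and has no outgoing transitions
    """
    def move(states, sym):
        nxt = set()
        for s in states:
            if s == 0:
                nxt.add(0)              # the leading (a|b)* loop
                if sym == "a":
                    nxt.add(1)          # guess that this 'a' is the marked one
            elif s <= n:
                nxt.add(s + 1)          # count one more symbol after the 'a'
        return frozenset(nxt)

    start = frozenset({0})
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for sym in "ab":
            target = move(current, sym)
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return len(seen)


for n in (1, 3, 7):
    print(n, count_dfa_states(n))
```

In this toy family the reachable DFA state count doubles each time n grows by one, which is the kind of growth that makes a single combined DFA expensive to store.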


Algorithms to accelerate multiple regular expressions matching for deep packet inspection

Kumar¹, Dharmapurikar², Fang³ et al. 2006

SIGCOMM Comput. Commun. Rev.


There is a growing demand for network devices capable of examining the content of data packets in order to improve network security and provide application-specific services. Most high-performance systems that perform deep packet inspection implement simple string matching algorithms to match packets against a large, but finite, set of strings. However, there is growing interest in the use of regular expression-based pattern matching, since regular expressions offer superior expressive power and flexibility. Deterministic finite automata (DFA) representations are typically used to implement regular expressions. However, DFA representations of regular expression sets arising in network applications require large amounts of memory, limiting their practical application. In this paper, we introduce a new representation for regular expressions, called the Delayed Input DFA (D²FA), which substantially reduces space requirements as compared to a DFA. A D²FA is constructed by transforming a DFA via incrementally replacing several transitions of the automaton with a single default transition. Our approach dramatically reduces the number of distinct transitions between states. For a collection of regular expressions drawn from current commercial and academic systems, a D²FA representation reduces transitions by more than 95%. Given the substantially reduced space requirements, we describe an efficient architecture that can perform deep packet inspection at multi-gigabit rates. Our architecture uses multiple on-chip memories in such a way that each remains uniformly occupied and accessed over a short duration, thus effectively distributing the load and enabling high throughput. Our architecture can provide cost-effective packet content scanning at OC-192 rates with memory requirements that are consistent with current ASIC technology.
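The default-transition idea in the abstract can be sketched roughly as follows. This is a toy illustration, not the authors' construction: the three-state DFA, the hand-picked `default` map, and the helper names `compress` and `step` are all assumptions, and the sketch does not address how to choose default transitions well. It only shows that a state can drop every labelled transition it shares with its default state, and that lookups still return the same next state.

```python
def compress(dfa, default):
    """Drop labelled transitions that a state shares with its default state.

    dfa:     {state: {symbol: next_state}}, complete over the alphabet
    default: {state: default_state}, assumed acyclic; states absent from
             this map keep their full transition table
    """
    compressed = {}
    for state, trans in dfa.items():
        parent = default.get(state)
        if parent is None:
            compressed[state] = dict(trans)
        else:
            # Keep only the transitions that differ from the default state's.
            compressed[state] = {
                sym: nxt for sym, nxt in trans.items() if dfa[parent][sym] != nxt
            }
    return compressed


def step(compressed, default, state, sym):
    """Follow default transitions (consuming no input) until a labelled
    transition on sym exists, then take it."""
    while sym not in compressed[state]:
        state = default[state]
    return compressed[state][sym]


# Toy DFA over {a, b, c} that reaches state 2 after seeing "ab".
dfa = {
    0: {"a": 1, "b": 0, "c": 0},
    1: {"a": 1, "b": 2, "c": 0},
    2: {"a": 1, "b": 0, "c": 0},
}
default = {1: 0, 2: 0}  # hand-picked for this sketch

d2fa = compress(dfa, default)
labelled = sum(len(trans) for trans in d2fa.values())
print(labelled, "labelled transitions instead of", sum(len(t) for t in dfa.values()))
```

In this toy case the nine original transitions shrink to four labelled ones plus two default pointers. The cost is that a lookup may follow a default transition before consuming the input symbol, so each symbol can take more than one memory access.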



A precision agriculture management system based on Internet of Things and WebGIS

Ye¹, Chen², Li³ et al. 2013


