Summary
Regular expression matching is a core component of deep packet inspection and is widely used in modern network intrusion detection systems, traffic classification systems, network monitoring systems, and similar applications. In these systems, regular expressions are typically converted to a deterministic finite automaton (DFA), which takes O(1) time to process each input character. However, a DFA generally consumes a large amount of memory. This paper proposes a novel, space‐efficient and time‐efficient DFA representation called reduced input character set DFA (RICS‐DFA). A character escaping and replacing scheme is first introduced to decrease the size of the DFA's character set; a series of optimization techniques then further reduces the DFA's space requirement. Based on transition rewriting, a RICS‐DFA construction algorithm with time complexity of O(n) is presented. For real rule‐sets, RICS‐DFA reduces memory consumption by 68–92% compared with the original DFA. Finally, this paper designs a scalable RICS‐DFA matching engine on a field‐programmable gate array platform, in which the reduced state transition matrix is mapped to on‐chip memories. The throughput of deep packet inspection for real rule‐sets reaches 7–50.5 Gbps. Copyright © 2016 John Wiley & Sons, Ltd.
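The summary does not spell out the escaping and replacing scheme, so the following Python sketch is not the RICS‐DFA construction. It illustrates only the general idea that a smaller input character set shrinks a table‐driven DFA, using a different and generic technique: bytes whose transition columns are identical are merged into one class. All names (`build_example_table`, `compress_alphabet`, `scan`) and the three‐state example DFA for the pattern "ab" are assumptions made for illustration.

```python
def build_example_table():
    """Hypothetical 3-state DFA that finds "ab" anywhere in a byte stream.

    State 0: nothing matched, state 1: just saw 'a', state 2: just saw "ab".
    table[s][c] is the next state from state s on byte value c.
    """
    a, b = ord("a"), ord("b")
    table = [[0] * 256 for _ in range(3)]
    for s in range(3):
        table[s][a] = 1          # 'a' always (re)starts a match
    table[1][b] = 2              # 'b' right after 'a' completes the match
    return table


def compress_alphabet(table, alphabet_size=256):
    """Merge bytes whose transition columns are identical into one class.

    Returns (class_of, reduced): class_of[c] is the class index of byte c,
    and reduced[s][k] is the next state from s on any byte of class k.
    """
    column_to_class = {}
    representatives = []         # one sample byte per class
    class_of = [0] * alphabet_size
    for c in range(alphabet_size):
        column = tuple(row[c] for row in table)
        if column not in column_to_class:
            column_to_class[column] = len(representatives)
            representatives.append(c)
        class_of[c] = column_to_class[column]
    reduced = [[row[c] for c in representatives] for row in table]
    return class_of, reduced


def scan(reduced, class_of, accepting, data, start=0):
    """Run the reduced DFA; one class lookup and one table lookup per byte."""
    state = start
    hits = []
    for i, byte in enumerate(data):
        state = reduced[state][class_of[byte]]
        if state in accepting:
            hits.append(i)       # index of the last byte of a match
    return hits


if __name__ == "__main__":
    table = build_example_table()
    class_of, reduced = compress_alphabet(table)
    print(len(table[0]), "->", len(reduced[0]), "columns per state")
    print(scan(reduced, class_of, {2}, b"xabab"))
```

In this toy case the 256 columns per state collapse to 3, at the cost of one extra lookup per input byte, so scanning remains O(1) per character. The savings on real rule‐sets depend on how many bytes actually behave identically and should not be read as the figures reported above.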