Decision tree (DT)-based machine learning (ML) algorithms are among the preferred solutions for real-time internet traffic classification because they are easy to implement in hardware. However, the rapid increase in newly developed applications and the resulting diversity in internet traffic greatly increase the size of DTs. As a result, tree-based hardware classifiers cannot keep up with this growth in terms of resource usage and classification speed. To alleviate the problem, we propose to group application classes by certain rules and create an individual small DT for each group. In this article, a pipelined organization of multiple DT data structures, called pipelined decision trees, is proposed as a scalable solution to tree-based traffic classification. We also propose two distinct algorithms, namely confusion matrix-based class aggregation and leaf count-based class aggregation, to set group creation rules that allow traffic classification on pipelined smaller DTs in a hierarchical order. We further designed a hardware engine on field programmable gate arrays, which can search those pipelined trees within a single clock cycle by transforming them into bit vectors and implementing multiple range comparisons in parallel. Our architecture with 12 classes runs at 928.88 gigabits per second and achieves 96.04% accuracy.
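The idea of searching small pipelined trees through bit vectors and range comparisons can be illustrated in software. The following is only a minimal sketch under stated assumptions, not the hardware engine described above: each tree is assumed to be flattened into leaves given as per-feature ranges, each feature gets a table of intervals with one bit per leaf, and a lookup ANDs the selected vectors. The class `BitVectorTree`, the two-stage layout, the thresholds, and the group and class names (`interactive`, `bulk`, `dns`, `ssh`, `video`, `ftp`) are all hypothetical. The sequential loop stands in for comparisons that hardware would perform in parallel.

```python
from bisect import bisect_right

INF = float("inf")


class BitVectorTree:
    """A small decision tree flattened into per-feature tables of leaf bit vectors."""

    def __init__(self, leaves, num_features):
        # leaves: list of (ranges, label). ranges maps a feature index to
        # (lo, hi), meaning lo <= x < hi. Missing features are unconstrained.
        self.labels = [label for _, label in leaves]
        self.tables = []
        for f in range(num_features):
            spans = [ranges.get(f, (-INF, INF)) for ranges, _ in leaves]
            # Finite range bounds split the feature axis into intervals.
            cuts = sorted({b for span in spans for b in span if abs(b) != INF})
            vectors = []
            for edge in [-INF] + cuts:  # left edge of each interval
                vec = 0
                for i, (lo, hi) in enumerate(spans):
                    if lo <= edge < hi:  # leaf i covers this interval
                        vec |= 1 << i
                vectors.append(vec)
            self.tables.append((cuts, vectors))

    def lookup(self, x):
        # Start with every leaf as a candidate, then AND in one vector per feature.
        vec = (1 << len(self.labels)) - 1
        for f, (cuts, vectors) in enumerate(self.tables):
            vec &= vectors[bisect_right(cuts, x[f])]
        # A well-formed tree leaves exactly one bit set.
        return self.labels[vec.bit_length() - 1]


# Stage 1 picks a class group; stage 2 holds one small tree per group.
stage1 = BitVectorTree(
    [({0: (-INF, 100)}, "interactive"), ({0: (100, INF)}, "bulk")], num_features=2
)
stage2 = {
    "interactive": BitVectorTree(
        [({1: (-INF, 50)}, "dns"), ({1: (50, INF)}, "ssh")], num_features=2
    ),
    "bulk": BitVectorTree(
        [({1: (-INF, 10)}, "video"), ({1: (10, INF)}, "ftp")], num_features=2
    ),
}


def classify(x):
    """Run the two pipeline stages in order: group first, then class."""
    group = stage1.lookup(x)
    return stage2[group].lookup(x)
```

For instance, `classify((40, 70))` first selects the `interactive` group from feature 0 and then resolves the class from feature 1 in that group's tree. Each stage only ever searches a small tree, which is the property the pipelined organization relies on.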