Distributed denial of service (DDoS) is an immense threat for Internet based-applications and their resources. It immediately floods the victim system by transmitting a large number of network packets, and due to this, the victim system resources become unavailable for legitimate users. Therefore, this attack is claimed to be a dangerous attack for Internet-based applications and their resources. Several security approaches have been proposed in the literature to protect Internet-based applications from this type of threat. However, the frequency and strength of DDoS attacks are increasing day-by-day. Further, most of the traditional and distributed processing frameworks-based DDoS attack detection systems analyzed network flows in offline batch processing. Hence, they failed to classify network flows in real-time. This paper proposes a novel Spark Streaming and Kafka-based distributed classification system, named by SSK-DDoS, for classifying different types of DDoS attacks and legitimate network flows. This classification approach is implemented using a distributed Spark MLlib machine learning algorithms on a Hadoop cluster and deployed on the Spark streaming platform to classify streams in real-time. The incoming streams consume by Kafka’s topic to perform preprocessing tasks such as extracting and formulating features for classifying them into seven groups: Benign, DDoS-DNS, DDoS-LDAP, DDoS-MSSQL, DDoS-NetBIOS, DDoS-UDP, and DDoS-SYN. Further, the SSK-DDoS classification system stores formulated features with their predicted class into the HDFS that will help to retrain the distributed classification approach using a new set of samples. The proposed SSK-DDoS classification system has been validated using the recent CICDDoS2019 dataset. The results show that the proposed SSK-DDoS efficiently classified network flows into seven classes and stored formulated features with the predicted value of each incoming network flow into HDFS.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.