In this work, the problem of anomaly detection in imbalanced datasets, framed in the context of network intrusion detection, is studied. A novel anomaly detection solution is proposed that takes both data‐level and algorithm‐level approaches into account to cope with the class‐imbalance problem. This solution integrates the auto‐learning ability of Reinforcement Learning with the oversampling ability of a Conditional Generative Adversarial Network (CGAN). To further investigate the potential of a CGAN in imbalanced classification tasks, the effect of CGAN‐based oversampling on the following classifiers is examined: Naïve Bayes, Multilayer Perceptron, Random Forest and Logistic Regression. The experimental results demonstrate improved performance from the proposed approach, and from CGAN‐based oversampling in general, over other oversampling techniques such as Synthetic Minority Oversampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN).
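To make the idea of data-level oversampling concrete, the following is a minimal sketch of the interpolation step behind the Synthetic Minority Oversampling Technique, one of the baselines named above. It is not the CGAN-based method proposed in the work, and it is not a reference implementation of that baseline: the function name `smote_like_oversample`, its parameters, and the toy data are all assumptions made for illustration.

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment between base and neighbour.
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy minority class (assumed data): four points on the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like_oversample(minority, n_new=6)
```

Because every synthetic point lies on a segment between two existing minority samples, this kind of oversampling never leaves the region already spanned by the minority class. A generative model such as a CGAN instead learns to sample from an estimated class-conditional distribution.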
Cybersecurity has become a significant issue. Machine learning algorithms are known to help identify cyberattacks such as network intrusion. However, common network intrusion datasets are negatively affected by class imbalance: normal traffic behaviour constitutes most of the dataset, whereas intrusion traffic behaviour forms a significantly smaller portion. A comparative evaluation is conducted of the performance of several classical machine learning algorithms, as well as deep learning algorithms, on the well-known National Security Lab Knowledge Discovery and Data Mining (NSL-KDD) dataset for intrusion detection. More specifically, two variants of a fully connected neural network, one with an autoencoder and one without, are implemented and compared against seven classical machine learning algorithms. A voting classifier is also proposed to combine the decisions of these nine machine learning algorithms. All of the models are tested in combination with three different resampling techniques: oversampling, undersampling, and hybrid sampling. The details of the experiments conducted and an analysis of their results are then discussed.
This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
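The resampling strategies and the voting step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which specific oversampling, undersampling, or voting schemes were used, so random resampling and hard (majority) voting are assumed here purely for illustration. Hybrid sampling is not shown. The helper names and toy data are hypothetical.

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate random samples of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - count):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out

def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        pool = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(pool, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out

def hard_vote(predictions):
    """Majority vote over per-model label lists (one list per model)."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

# Toy imbalanced data (assumed): 8 "normal" records and 2 "attack" records.
X = [[i] for i in range(10)]
y = ["normal"] * 8 + ["attack"] * 2
X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)

# Three hypothetical models each predict labels for the same two records.
voted = hard_vote([["normal", "attack"], ["attack", "attack"], ["normal", "attack"]])
```

Oversampling keeps every original record at the cost of duplicates, whereas undersampling discards majority-class records. This trade-off is one reason to compare models under both strategies.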