Data is now produced, transmitted and stored across the internet in enormous quantities, and network security systems have had to develop at a comparable pace to keep up. Consequently, companies spend substantial money, time and effort developing intrusion detection systems (IDS) that detect and prevent malicious activity in a timely manner and so preserve system security. In this paper, we propose a two-stage algorithm for network intrusion detection (NIDS) that aims to produce fewer false positives and false negatives. We built and evaluated our IDS on the benchmark NSL-KDD dataset, which consists of 41 features, of which 3 are categorical and the remaining 38 are numeric. The categorical features are transformed via One-Hot Encoding, which expands the feature space to 122 features. Our aim is then to reduce this feature space through feature selection and dimensionality reduction, so that a computationally inexpensive classifier built from sequential models can operate on the reduced feature space. The work uses autoencoders for feature space reduction followed by a feed-forward network, and has delivered encouraging results. We then extend the analysis to identify features that can be eliminated without substantial loss of the information available to the classification algorithm. The remaining features can then be fed into a different model, possibly to improve results or to reduce training and evaluation time.
With the exponential increase in the amount of data produced, transmitted, stored and exchanged over the internet, intrusion detection systems have become an integral part of modern network security. Considerable expense, time and effort are spent on the timely detection and denial of malicious users in order to preserve the key objectives of system security: confidentiality, integrity, and availability. In this paper, we propose a dual-stage algorithm for network intrusion detection (NIDS). Our aim is to construct an algorithm that yields few false positives and even fewer false negatives, as any IDS should. Research into network intrusion detection systems dates back to the early 1990s, when researchers initially developed rule-based systems such as SNORT and TCPDUMP. As the subject gained traction and importance, researchers shifted their efforts towards anomaly detection systems, using benchmark datasets to test their algorithms. In the early 2010s, several papers on NIDS were published, and most of the technological breakthroughs in the field were influenced by deep learning. We have focused on building an IDS using the benchmark NSL-KDD dataset. The dataset consists of 41 features, of which 3 are categorical and the remaining 38 are numeric. Because we opted for One-Hot Encoding of the categorical features, our feature space expands to a total of 122 features. Despite the obvious drawback of a larger feature space, this was necessary because the categorical features have no implicit ordering among their values, and an integer encoding would impose one. The aim is to develop an efficiently compressed feature space, with the ultimate goal of a computationally light classification model, built from sequential models, that operates on the compressed feature set.
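To make the jump from 41 to 122 features concrete, the following is a minimal sketch of One-Hot Encoding and of the resulting feature count. The helper names are hypothetical, and the per-feature category counts (3 for `protocol_type`, 70 for `service`, 11 for `flag`) are an assumption based on the commonly reported NSL-KDD training split; the text above states only the totals 41 and 122.

```python
# Minimal sketch, not the authors' implementation.
# 41 total features minus 3 categorical ones (both figures are from the text).
NUMERIC_FEATURES = 41 - 3

# ASSUMPTION: distinct values per categorical feature in the NSL-KDD
# training split. These counts are not given in the text.
ASSUMED_CARDINALITIES = {"protocol_type": 3, "service": 70, "flag": 11}


def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


def encoded_width(numeric_count, cardinalities):
    """Each categorical feature contributes one column per distinct value."""
    return numeric_count + sum(cardinalities.values())


# Illustrative category list for one feature (assumed values).
protocols = ["icmp", "tcp", "udp"]
tcp_vector = one_hot("tcp", protocols)

total_width = encoded_width(NUMERIC_FEATURES, ASSUMED_CARDINALITIES)
```

Under these assumed counts, the encoded width is 38 + 3 + 70 + 11 = 122, which matches the figure quoted above. No category is numerically "larger" than another in the encoded vector, which is why this encoding suits features without an implicit ordering.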