Feature extraction for machine learning-based intrusion detection in IoT networks

Sarhan, Mohanad; Layeghy, Siamak; Moustafa, Nour; Gallagher, Marcus; Portmann, Marius

doi:10.1016/j.dcan.2022.08.012

Cited by 67 publications

(27 citation statements)

References 14 publications

“…The PCA_FE of KNN and SVM models increases until dimension 15 with almost 95% for ACC and AUC and 91% in DR for KNN, while SVM gets 93% in ACC, 79% in DR, and 91% in AUC. DT, RF, and LR models require only ten dimensions for PCA_FE [25]. The GIWRF_FE model performs better with fewer dimensions, reducing the PCA_FE of the KNN, DT, and RF models.…”

Section: Experimental Results and Findings (mentioning)

confidence: 99%

Performance Analysis of Intrusion Detection System in the IoT Environment Using Feature Selection Technique

Alhanaya, Al-Shqeerat

2023

Intelligent Automation & Soft Computing

The increasing number of security holes in Internet of Things (IoT) networks raises questions about the reliability of existing network intrusion detection systems. This problem has led to the development of a research area focused on improving network-based intrusion detection system (NIDS) technologies. According to the analysis of different businesses, most researchers focus on improving the classification results of NIDS datasets by combining machine learning and feature reduction techniques. However, these techniques are not suitable for every type of network. In light of this, whether the optimal algorithm and feature reduction techniques can be generalized across various datasets for IoT networks remains an open question. The paper aims to analyze the methods used in this research and whether they can be generalized to other datasets. Six ML models were used in this study, namely, logistic regression (LR), decision trees (DT), Naive Bayes (NB), random forest (RF), K-nearest neighbors (KNN), and linear SVM. The primary algorithms used in this study, Principal Component Analysis (PCA) and Gini Impurity-Based Weighted Random Forest (GIWRF), were evaluated against three datasets: ToN-IoT, UNSW-NB15, and Bot-IoT. The optimal number of dimensions for each dataset was not studied by applying the PCA algorithm. It is stated in the paper that the selection of datasets affects the performance of the FE techniques and detection algorithms used. Increasing the efficiency of this research area requires a comprehensive standard feature set that can be used to improve quality over time.

“…The IoT nodes (for instance, green gas IoT and industrial IoT actuators) communicate using MQTT, and they publish and subscribe to different topics, namely temperature and humidity. Sarhan et al [31] intended to standardise the techniques to apply them to any dataset. Six ML models, deep feed forward (DFF), CNN, recurrent neural network (RNN), DT, LR, and NB, and three feature extraction algorithms, principal component analysis (PCA), linear discriminant analysis (LDA), and automatic encoder, were applied on three reference datasets, and among them was the ToN-IoT [30].…”

Section: Related Work (mentioning)

confidence: 99%

“…Several normal and cyber attack events from IoT networks DFF, CNN, RNN, DT, LR, NB [30]. Autoencoder [31] MedBIoT [32] IoT network (i.e., fans, locks, light bulbs and switches). Mirai, BashLite, Torii KNN, SVM, DT, RF [32].…”

Section: Doshi Et Al [14] (mentioning)

confidence: 99%

Application of Machine Learning Algorithms for the Validation of a New CoAP-IoT Anomaly Detection Dataset

Vigoya, Pardal, Fernández et al.

2023

Applied Sciences

With the rise in smart devices, the Internet of Things (IoT) has been established as one of the preferred emerging platforms to fulfil their need for simple interconnections. The use of specific protocols such as constrained application protocol (CoAP) has demonstrated improvements in the performance of the networks. However, power-, bandwidth-, and memory-constrained sensing devices constitute a weakness in the security of the system. One way to mitigate these security problems is through anomaly-based intrusion detection systems, which aim to estimate the behaviour of the systems based on their “normal” nature. Thus, to develop anomaly-based intrusion detection systems, it is necessary to have a suitable dataset that allows for their analysis. Due to the lack of a public dataset in the CoAP-IoT environment, this work aims to present a complete and labelled CoAP-IoT anomaly detection dataset (CIDAD) based on real-world traffic, with a sufficient trace size and diverse anomalous scenarios. The modelled data were implemented in a virtual sensor environment, including three types of anomalies in the CoAP data. The validation of the dataset was carried out using five shallow machine learning techniques: logistic regression, naive Bayes, random forest, AdaBoost, and support vector machine. Detailed analyses of the dataset, data conditioning, feature engineering, and hyperparameter tuning are presented. The evaluation metrics used in the performance comparison are accuracy, precision, recall, F1 score, and kappa score. The system achieved 99.9% accuracy for decision tree models. Random forest established itself as the best model, obtaining a 99.9% precision and F1 score, 100% recall, and a Cohen’s kappa statistic of 0.99.
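
As a rough illustration of the evaluation metrics named in the abstract above (accuracy, precision, recall, F1 score, and Cohen's kappa), the Python sketch below computes them for a binary anomaly/normal labelling. It is not the authors' code: the function names and the toy labels are assumptions made for this example, and the paper may have used a different labelling scheme or a library implementation.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = anomaly and 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, F1 and Cohen's kappa for binary labels."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("no samples given")
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance from the marginal label frequencies.
    p_anomaly = ((tp + fp) / n) * ((tp + fn) / n)
    p_normal = ((tn + fn) / n) * ((tn + fp) / n)
    p_chance = p_anomaly + p_normal
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}


# Toy labels (hypothetical): 4 anomalies and 6 normal samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

Kappa is reported alongside accuracy because, on imbalanced traffic, a classifier can reach high accuracy simply by predicting the majority class; kappa discounts that chance agreement.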

“…Data processing plays a vital role in an IDS and essential first step in enhancing the training process for the machine learning models [29]. It can be used to derive data preprocessing, which has a direct impact on how well a model performs in terms of classification.…”

Section: Data Preprocessing (mentioning)

confidence: 99%

“…-Most metrics are affected by the imbalance of classes in the datasets. Therefore, a single metric cannot be used to differentiate between models [29]. Thus, The ROC curves plotting both the DR and FAR for distinguishing between attack and benign on the x- and y-axes respectively.…”

Section: Evalution Model (mentioning)

confidence: 99%
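
To make the ROC idea in the statement above concrete, here is a minimal Python sketch that sweeps a decision threshold over classifier scores and records the false alarm rate (FAR) and detection rate (DR) at each step. It follows the common convention of FAR on the x-axis and DR on the y-axis. The function names, scores, and labels are hypothetical and do not come from the cited paper.

```python
def roc_points(y_true, scores):
    """Return (FAR, DR) pairs, one per distinct threshold, starting at (0, 0).

    y_true holds 1 for attack and 0 for benign; a sample is flagged as an
    attack when its score is greater than or equal to the threshold.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one attack and one benign sample")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (hypothetical): three attacks and three benign flows.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(y_true, scores)
auc = area_under_curve(curve)
```

Because the curve summarises every threshold at once, it is less sensitive to one particular operating point than a single accuracy figure, which is the concern the statement raises about class imbalance.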

Machine learning to improve the performance of anomaly-based network intrusion detection in big data

Chimphlee

2023

IJEECS

With the rapid growth of digital technology, communications are overwhelmed by network data traffic. The demand for the internet is growing every day in today's cyber world, raising concerns about network security. Big data is a term that describes a vast volume of complicated data that is critical for evaluating network patterns and determining what has occurred in the network. Therefore, detecting attacks in a large network is challenging. The intrusion detection system (IDS) is a promising cybersecurity research field. In this paper, we proposed an efficient classification scheme for IDS, which is divided into two procedures. On the CSE-CIC-IDS-2018 dataset, data pre-processing techniques including under-sampling, feature selection, and classifier algorithms were used to assess and decide the best performing model to classify invaders. We have implemented and compared seven classifier machine learning algorithms with various criteria. This work explored the application of the random forest (RF) for feature selection in conjunction with machine learning (ML) techniques including linear regression (LR), k-Nearest Neighbor (k-NN), classification and regression trees (CART), Bayes, RF, multi layer perceptron (MLP), and XGBoost in order to implement IDSs. The experimental results show that the MLP algorithm is the most successful, with the best performance on the evaluation matrix.
