The digital revolution has substantially changed our lives, and the Internet-of-Things (IoT) plays a prominent role in it. The rapid spread of IoT into most corners of life, however, leads to various emerging cybersecurity threats. Therefore, detecting and preventing potential attacks in IoT networks has recently attracted considerable interest from both academia and industry. Among various attack detection approaches, machine learning-based methods, especially deep learning, have demonstrated great potential thanks to their early detection capability. However, these machine learning techniques only work well when a huge volume of labeled data from IoT devices can be collected. The labeling process is usually time-consuming and expensive, so it may not be able to keep pace with rapidly evolving IoT attacks in practice. In this paper, we propose a novel deep transfer learning (DTL) method that learns from data collected from multiple IoT devices, not all of which are labeled. Specifically, we develop a DTL model based on two AutoEncoders (AEs). The first AE (AE 1) is trained on the source datasets (source domains) in a supervised mode using the label information, and the second AE (AE 2) is trained on the target datasets (target domains) in an unsupervised manner without label information. The transfer learning process attempts to force the latent representation (the bottleneck layer) of AE 2 to be similar to the latent representation of AE 1. After that, the latent representation of AE 2 is used to detect attacks in the incoming samples in the target domain. We carry out intensive experiments on nine recent IoT datasets to evaluate the performance of the proposed model. The experimental results demonstrate that the proposed DTL model significantly improves the accuracy in detecting IoT attacks compared to the baseline deep learning technique and two recent DTL approaches.
INDEX TERMS Deep transfer learning, IoT, cyberattack detection, AutoEncoder.
I. INTRODUCTION
THE Internet-of-Things (IoT) refers to connected devices, sensors, and actuators used in vehicles, electronic appliances, buildings, and structures. As sensors, data storage, and the Internet become cheaper, faster, and more tightly integrated, IoT devices will find more and more applications [1] (e.g., in smart buildings, smart cities, intelligent transportation systems, and healthcare). The rapid spread of IoT into most corners of life, however, leads to various emerging cybersecurity threats. This is because IoT devices are often limited in computing capability and energy, making them particularly vulnerable to adversaries. IoT devices are more exposed to cyber attacks than computers and, unfortunately, more difficult to protect from them [2], [3]. Consequently, detecting attacks to protect IoT devices from malicious behaviors is critical to broadening the applications of IoT [4]-[7].
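The two-AutoEncoder transfer scheme in the DTL abstract above can be illustrated by the shape of its training objective. The following is a minimal sketch, not the authors' implementation: the abstract does not say which distance is used to pull the latent representation of AE 2 toward that of AE 1, so the squared distance between batch means used here is an assumption, as are the function names, the `align_weight` parameter, and the choice to pass the supervised term in as a precomputed number.

```python
from typing import List, Sequence

Batch = Sequence[Sequence[float]]


def mse(a: Batch, b: Batch) -> float:
    """Mean squared error over all entries of two equally shaped batches."""
    diffs = [(x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def batch_mean(z: Batch) -> List[float]:
    """Component-wise mean of a batch of latent vectors."""
    return [sum(col) / len(z) for col in zip(*z)]


def latent_alignment(z_src: Batch, z_tgt: Batch) -> float:
    """Assumed alignment term: squared distance between the two batch means."""
    return sum((s - t) ** 2 for s, t in zip(batch_mean(z_src), batch_mean(z_tgt)))


def dtl_loss(x_src: Batch, rec_src: Batch, x_tgt: Batch, rec_tgt: Batch,
             z_src: Batch, z_tgt: Batch,
             supervised_loss: float, align_weight: float = 1.0) -> float:
    """Combine source, target, and alignment terms (the weighting is an assumption)."""
    return (mse(x_src, rec_src)      # AE 1 reconstruction on the labeled source domain
            + supervised_loss        # AE 1 label term, computed elsewhere
            + mse(x_tgt, rec_tgt)    # AE 2 reconstruction on the unlabeled target domain
            + align_weight * latent_alignment(z_src, z_tgt))
```

In a real model the reconstructions and latent codes would come from trained networks and the sum would be minimized by gradient descent; the sketch only shows how the labeled source domain, the unlabeled target domain, and the alignment between the two bottleneck layers enter a single objective.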
In this paper, we develop a new deep learning approach, Multi-distributed Variational AutoEncoder (MVAE), to enhance network intrusion detection. MVAE introduces the label information of data samples into the loss function of the VAE. This label information, together with the reconstruction error function of the VAE, forces each class of network data into a different region of the latent feature space of MVAE. As a result, the network traffic samples are more distinguishable in the new representation space, thereby improving the accuracy of classifiers that detect intrusions in the latent feature space of MVAE. To evaluate the efficiency of the proposed solution, we carry out intensive experiments on two popular network intrusion datasets, i.e., NSL-KDD and UNSW-NB15, with four conventional classifiers: Gaussian Naive Bayes (GNB), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF). The experimental results demonstrate that our proposed approach can improve the accuracy of intrusion detection algorithms by up to 0.246 compared to the original ones.
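One way to read the MVAE idea above is that the usual VAE regularizer is computed against a class-specific prior rather than a single shared one, so each class is pulled toward its own region of the latent space. The sketch below illustrates that reading only; the abstract does not give the loss, so the unit-variance Gaussian priors, the `CLASS_MEANS` values, and the function names are assumptions.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical prior means, one per class, placed far apart in a 2-D latent space.
CLASS_MEANS: Dict[int, List[float]] = {0: [-2.0, -2.0], 1: [2.0, 2.0]}


def kl_to_class_prior(mu: Sequence[float], log_var: Sequence[float],
                      class_mean: Sequence[float]) -> float:
    """KL( N(mu, diag(exp(log_var))) || N(class_mean, I) ) in closed form."""
    return 0.5 * sum(math.exp(lv) + (m - c) ** 2 - 1.0 - lv
                     for m, lv, c in zip(mu, log_var, class_mean))


def mvae_style_loss(reconstruction_error: float, mu: Sequence[float],
                    log_var: Sequence[float], label: int) -> float:
    """Reconstruction error plus a KL term toward the prior of the sample's class."""
    return reconstruction_error + kl_to_class_prior(mu, log_var, CLASS_MEANS[label])
```

With these assumed priors, an encoding that sits exactly on its class mean with unit variance incurs no penalty, while the same encoding assigned to the other class is penalized by its squared distance from that class's mean.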
Internet-of-Things (IoT) has emerged as a cutting-edge technology that is changing human life. The rapid and widespread application of IoT, however, makes cyberspace more vulnerable, especially to IoT-based attacks in which IoT devices are used to launch attacks on cyber-physical systems. Given the massive number of IoT devices (on the order of billions), detecting and preventing these IoT-based attacks is critical. However, this task is very challenging due to the limited energy and computing capabilities of IoT devices and the continuous, fast evolution of attackers. Among IoT-based attacks, unknown ones are far more devastating, as these attacks can bypass most current security systems and it takes time to detect them and "cure" the systems. To effectively detect new/unknown attacks, in this paper, we propose a novel representation learning method to better predictively "describe" unknown attacks, facilitating supervised learning-based anomaly detection methods. Specifically, we develop three regularized versions of AutoEncoders (AEs) to learn a latent representation from the input data. The bottleneck layers of these regularized AEs, trained in a supervised manner using normal data and known IoT attacks, are then used as the new input features for classification algorithms. We carry out intensive experiments on nine recent IoT datasets to evaluate the performance of the proposed models. The experimental results demonstrate that the new latent representation can significantly enhance the performance of supervised learning methods in detecting unknown IoT attacks. We also conduct experiments to investigate the characteristics of the proposed models and the influence of hyperparameters on their performance. The running time of these models is about 1.3 milliseconds, which is practical for most applications.
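The pipeline in the abstract above, where the bottleneck layer becomes the feature vector handed to a classifier, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's models: the one-layer `encode` function with hand-picked weights stands in for a trained regularized AE encoder, and the nearest-centroid rule stands in for whichever classification algorithm is used downstream.

```python
import math
from typing import Dict, Hashable, List, Sequence


def encode(x: Sequence[float], weights: Sequence[Sequence[float]],
           biases: Sequence[float]) -> List[float]:
    """Stand-in encoder: one dense layer with tanh, mapping input to the bottleneck."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def fit_centroids(latents: Sequence[Sequence[float]],
                  labels: Sequence[Hashable]) -> Dict[Hashable, List[float]]:
    """Average the latent vectors of each class (a placeholder classifier)."""
    sums: Dict[Hashable, List[float]] = {}
    counts: Dict[Hashable, int] = {}
    for z, y in zip(latents, labels):
        acc = sums.setdefault(y, [0.0] * len(z))
        for i, v in enumerate(z):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(z: Sequence[float], centroids: Dict[Hashable, List[float]]) -> Hashable:
    """Return the class whose centroid is closest to z in the latent space."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(z, centroids[y])))
```

Classification happens entirely in the latent space: training samples are encoded once, the classifier is fitted on those codes, and each incoming sample is encoded with the same encoder before prediction.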
Machine learning-based intrusion detection has become more popular in the research community thanks to its capability in discovering unknown attacks. To develop a good detection model for an intrusion detection system (IDS) using machine learning, a great number of attack and normal data samples are required in the learning process. While normal data can be relatively easy to collect, attack data is much rarer and harder to gather. Consequently, IDS datasets are often dominated by normal data, and machine learning models trained on those imbalanced datasets are ineffective in detecting attacks. In this paper, we propose a novel solution to this problem by using generative adversarial networks to generate synthesized attack data for IDS. The synthesized attacks are merged with the original data to form the augmented dataset. Three popular machine learning techniques are trained on the augmented dataset. The experiments conducted on three common IDS datasets and one dataset of our own show that machine learning algorithms achieve better performance when trained on the dataset augmented by the generative adversarial networks than when trained on the original dataset or on datasets produced by other sampling techniques. A visualization technique was also used to analyze the properties of the data synthesized by the generative adversarial networks and by the other techniques.
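The augmentation step described above, merging synthesized attacks with the original data, can be sketched as below. The `generate_attack` callable is a hypothetical stand-in for sampling from a trained GAN generator, and topping up the attack class until it matches the normal class is an assumption; the abstract does not state how many synthetic samples are added.

```python
from typing import Callable, List, Sequence, Tuple

Sample = List[float]


def augment_with_synthetic(samples: Sequence[Sample], labels: Sequence[int],
                           generate_attack: Callable[[], Sample],
                           attack_label: int = 1,
                           normal_label: int = 0) -> Tuple[List[Sample], List[int]]:
    """Append synthetic attack samples until attacks are as numerous as normal data."""
    n_attack = sum(1 for y in labels if y == attack_label)
    n_normal = sum(1 for y in labels if y == normal_label)
    deficit = max(0, n_normal - n_attack)  # no change if attacks already dominate
    new_samples = list(samples) + [generate_attack() for _ in range(deficit)]
    new_labels = list(labels) + [attack_label] * deficit
    return new_samples, new_labels
```

The classifiers would then be trained on the returned lists in place of the original, imbalanced ones; the original samples are kept unchanged and only the minority class grows.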