Cybersecurity is becoming increasingly important with the explosion of attack surfaces as more cyber-physical systems are being deployed. It is impractical to create models with acceptable performance for every single computing infrastructure and the various attack scenarios due to the cost of collecting labeled data and training models. Hence it is important to be able to develop models that can take advantage of knowledge available in an attack source domain to improve performance in a target domain with little domain specific data.In this work we proposed Domain Adaptive Host-based Intrusion Detection DAHID; an approach for detecting attacks in multiple domains for cybersecurity. Specifically, we implemented a deep learning model which utilizes a substantially smaller amount of target domain data for host-based intrusion detection.In our experiments, we used two datasets from Australian Defense Force Academy; ADFA-WD as the source domain and ADFA-WD:SAA as the target domain datasets. We recorded a significant improvement in Area Under Curve AUC from 83% to 91%, when we fine-tuned a deep learning model trained on ADFA-WD with as little as 20% of ADFA-WD:SAA. Our result shows transfer learning can help to alleviate the need of huge domain specific dataset in building host-based intrusion detection models.
Digital transformation has continued to have a remarkable impact on industries, creating new possibilities and improving the performance of existing ones. Recently, we have seen more deployments of cyber-physical systems and the Internet of Things (IoT) as in no other time. However, cybersecurity is often an afterthought in the design and implementation of many systems; therefore, there usually is an introduction of new attack surfaces as new systems and applications are being deployed. Machine learning has been helpful in creating intrusion detection models, but it is impractical to create attack detection models with acceptable performance for every single computing infrastructure and the various attack scenarios due to the cost of collecting quality labeled data and training models. Hence, there is a need to develop models that can take advantage of knowledge available in a high resource source domain to improve performance of a low resource target domain model. In this work, we propose a novel cross-domain deep learning-based approach for attack detection in Host-based Intrusion Detection Systems (HIDS). Specifically, we developed a method for candidate source domain selection from among a group of potential source domains by computing the similarity score a target domain records when paired with a potential source domain. Then, using different word embedding space combination techniques and transfer learning approach, we leverage the knowledge from a well performing source domain model to improve the performance of a similar model in the target domain. To evaluate our proposed approach, we used Leipzig Intrusion Detection Dataset (LID-DS), a HIDS dataset recorded on a modern operating system that consists of different attack scenarios. Our proposed cross-domain approach recorded significant improvement in the target domains when compared with the results from in-domain approach experiments. Based on the result, the F2-score of the target domain CWE-307 improved from 80% in the in-domain approach to 87% in the cross-domain approach while the target domain CVE-2014-0160 improved from 13% to 85%.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.