As cyber-attacks grow in number, intensity, and variety, a strong need has emerged for a global, standardized cyber-security knowledge database as a means to prevent and fight cybercrime. Attempts already exist in this regard. The Common Vulnerabilities and Exposures (CVE) list documents numerous reported software and hardware vulnerabilities, thus building a community-based dictionary of existing threats. The MITRE ATT&CK Framework describes adversary behavior and offers mitigation strategies for each reported attack pattern. While these tools are extremely powerful on their own, linking them brings a substantial additional benefit. This paper introduces a dataset of 1813 CVEs annotated with all corresponding MITRE ATT&CK techniques and proposes models to automatically link a CVE to one or more techniques based on the text description from the CVE metadata. We establish a strong baseline that considers classical machine learning models and state-of-the-art pre-trained BERT-based language models, while counteracting the highly imbalanced training set with data augmentation strategies based on the TextAttack framework. We obtain promising results, with the best model achieving an F1-score of 47.84%. In addition, we perform a qualitative analysis that uses LIME explanations to point out limitations and potential inconsistencies in CVE descriptions. Our model plays a critical role in finding kill chain scenarios inside complex infrastructures and enables the prioritization of CVE patching by threat level. We publicly release our code together with the dataset of annotated CVEs.
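Because a CVE may map to one or more techniques, the task is multi-label, and an F1-score has to be computed over sets of labels. The abstract does not state which averaging scheme was used, so the following is only a minimal sketch assuming micro-averaging. The function name `micro_f1`, the placeholder CVE identifiers, and the toy annotations are hypothetical illustrations, not data from the paper.

```python
from typing import Dict, Set


def micro_f1(gold: Dict[str, Set[str]], pred: Dict[str, Set[str]]) -> float:
    """Micro-averaged F1 over multi-label predictions keyed by CVE id.

    Assumption: micro-averaging is used here for illustration only; the
    abstract does not say how its F1-score was averaged.
    """
    tp = fp = fn = 0
    # Only CVEs present in the gold annotations are scored.
    for cve_id, gold_labels in gold.items():
        pred_labels = pred.get(cve_id, set())
        tp += len(gold_labels & pred_labels)   # techniques correctly predicted
        fp += len(pred_labels - gold_labels)   # techniques predicted but not annotated
        fn += len(gold_labels - pred_labels)   # annotated techniques that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy annotations: placeholder CVE ids mapped to ATT&CK technique ids.
gold = {
    "CVE-EXAMPLE-1": {"T1190", "T1059"},
    "CVE-EXAMPLE-2": {"T1068"},
}
pred = {
    "CVE-EXAMPLE-1": {"T1190"},
    "CVE-EXAMPLE-2": {"T1068", "T1059"},
}
score = micro_f1(gold, pred)
```

In this toy case one annotated technique is missed and one spurious technique is predicted, so precision and recall are both 2/3.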
Worldwide data infrastructure has grown in size and complexity over the last 10 years due to consolidation, centralization, and virtualization trends. Being able to discriminate quickly between large-scale non-directional attacks and targeted advanced persistent threats (APTs), or between script kiddies and experienced hackers, is key to protecting critical IT infrastructures. While the former can be easily handled by existing solutions, the latter raises significant challenges. Honeytokens and honeypots provide an extremely efficient form of intrusion detection, based on setting traps for hackers by deliberately placing enticing resources within existing environments. Previous research has used honeypots to understand hacking TTPs (tactics, techniques, and procedures) and to generate more realistic honeytokens. In this paper we build on existing results to quickly categorize attacks, map the attacker persona, and focus on targeted attacks. We influence the execution flow by trapping the attackers in a maze with three purposes. The first aim is to distract them from the real data and understand their motivation; this is done by placing low-hanging fruit in their path. The second aim is to get to know the attackers, gather forensic evidence, and use this information to adapt incident response. The last goal is the most difficult: to completely remove the threat by revealing the attackers' identity, getting in contact, handing them over to law enforcement agencies, or deterring them. We deploy a series of interconnected honeytokens, working together as a whole. Each honeytoken has an exploitation difficulty, in order to map out the attacker's skills, and leads to the next honeytoken, thus forming a real-world hacking scenario. We are also analysing the possibility of deploying dynamic traps based on how the attack develops in real time.
From a technical perspective, we propose a zero-touch approach for existing environments: the honeytokens are deployed as a service in the cloud, with minimal overhead for the customer.
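The idea of chained honeytokens, each with an exploitation difficulty and a pointer to the next trap, can be sketched as a small data structure. This is a hypothetical illustration only: the class names `Honeytoken` and `HoneytokenMaze`, the integer difficulty scale, the token names, and the persona thresholds are all assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Honeytoken:
    name: str
    difficulty: int                   # hypothetical scale: 1 = trivial, higher = harder
    next_token: Optional[str] = None  # name of the trap this one leads to, if any


class HoneytokenMaze:
    """Tracks which traps an attacker has triggered along the chain."""

    def __init__(self, tokens: List[Honeytoken]) -> None:
        self.tokens: Dict[str, Honeytoken] = {t.name: t for t in tokens}
        self.triggered: List[str] = []

    def trigger(self, name: str) -> Optional[str]:
        """Record that a trap was triggered and return the next trap's name."""
        token = self.tokens[name]
        self.triggered.append(name)
        return token.next_token

    def estimated_skill(self) -> int:
        """Highest difficulty among triggered traps (0 if none were triggered)."""
        return max((self.tokens[n].difficulty for n in self.triggered), default=0)

    def persona(self) -> str:
        """Map the skill estimate to a coarse persona (thresholds are assumptions)."""
        skill = self.estimated_skill()
        if skill == 0:
            return "unknown"
        if skill == 1:
            return "opportunistic"
        if skill == 2:
            return "intermediate"
        return "advanced"


# Hypothetical three-step chain, from low-hanging fruit to a harder trap.
maze = HoneytokenMaze([
    Honeytoken("exposed_credentials_file", 1, "fake_database_account"),
    Honeytoken("fake_database_account", 2, "decoy_admin_api_key"),
    Honeytoken("decoy_admin_api_key", 3),
])
next_name = maze.trigger("exposed_credentials_file")
next_name = maze.trigger(next_name)
```

After the two triggers above, `maze.persona()` returns `"intermediate"` and `next_name` points at the final trap. Using the maximum difficulty reached as the skill estimate is one simple design choice; a dynamic variant could instead pick the next trap at trigger time.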