Kangkook Jee scite author profile

Abstract-The increasingly sophisticated Advanced Persistent Threat (APT) attacks have become a serious challenge for enterprise IT security. Attack causality analysis, which tracks multi-hop causal relationships between files and processes to diagnose attack provenances and consequences, is the first step towards understanding APT attacks and taking appropriate responses. Since attack causality analysis is a time-critical mission, it is essential to design causality tracking systems that extract useful attack information in a timely manner. However, prior work is limited in serving this need. Existing approaches have largely focused on pruning causal dependencies totally irrelevant to the attack, but fail to differentiate and prioritize abnormal events from numerous relevant, yet benign and complicated system operations, resulting in long investigation time and slow responses.To address this problem, we propose PRIOTRACKER, a backward and forward causality tracker that automatically prioritizes the investigation of abnormal causal dependencies in the tracking process. Specifically, to assess the priority of a system event, we consider its rareness and topological features in the causality graph. To distinguish unusual operations from normal system events, we quantify the rareness of each event by developing a reference model which records common routine activities in corporate computer systems. We implement PRIOTRACKER, in 20K lines of Java code, and a reference model builder in 10K lines of Java code. We evaluate our tool by deploying both systems in a real enterprise IT environment, where we collect 1TB of 2.5 billion OS events from 150 machines in one week. Experimental results show that PRIOTRACKER can capture attack traces that are missed by existing trackers and reduce the analysis time by up to two orders of magnitude.

show abstract

NoDoze: Combatting Threat Alert Fatigue with Automated Provenance Triage

Hassan

Guo

et al. 2019

179

120

View full text Add to dashboard Cite

Large enterprises are increasingly relying on threat detection softwares (e.g., Intrusion Detection Systems) to allow them to spot suspicious activities. These softwares generate alerts which must be investigated by cyber analysts to figure out if they are true attacks. Unfortunately, in practice, there are more alerts than cyber analysts can properly investigate. This leads to a "threat alert fatigue" or information overload problem where cyber analysts miss true attack alerts in the noise of false alarms.

show abstract

High Fidelity Data Reduction for Big Data Security Dependency Analyses

Zhang¹,

Wu²,

Li³

et al. 2016

123

View full text Add to dashboard Cite

You Are What You Do: Hunting Stealthy Malware via Data Provenance Analysis

Wang¹,

Hassan²,

Ding³

et al. 2020

123

View full text Add to dashboard Cite

To subvert recent advances in perimeter and host security, the attacker community has developed and employed various attack vectors to make a malware much stealthier than before to penetrate the target system and prolong its presence. Such advanced malware or "stealthy malware" makes use of various techniques to impersonate or abuse benign applications and legitimate system tools to minimize its footprints in the target system. It is thus difficult for traditional detection tools, such as malware scanners, to detect it, as the malware normally does not expose its malicious payload in a file and hides its malicious behaviors among the benign behaviors of the processes. In this paper, we present PROVDETECTOR, a provenancebased approach for detecting stealthy malware. Our insight behind the PROVDETECTOR approach is that although a stealthy malware attempts to blend into benign processes, its malicious behaviors inevitably interact with the underlying operating system (OS), which will be exposed to and captured by provenance monitoring. Based on this intuition, PROVDETECTOR first employs a novel selection algorithm to identify possibly malicious parts in the OS-level provenance data of a process. It then applies a neural embedding and machine learning pipeline to automatically detect any behavior that deviates significantly from normal behaviors. We evaluate our approach on a large provenance dataset from an enterprise network and demonstrate that it achieves very high detection performance of stealthy malware (an average F1 score of 0.974). Further, we conduct thorough interpretability studies to understand the internals of the learned machine learning models.

show abstract

libdft

et al. 2012

View full text Add to dashboard Cite

Dynamic data flow tracking (DFT) deals with tagging and tracking data of interest as they propagate during program execution. DFT has been repeatedly implemented by a variety of tools for numerous purposes, including protection from zero-day and cross-site scripting attacks, detection and prevention of information leaks, and for the analysis of legitimate and malicious software. We present libdft, a dynamic DFT framework that unlike previous work is at once fast, reusable, and works with commodity software and hardware. libdft provides an API for building DFT-enabled tools that work on unmodified binaries, running on common operating systems and hardware, thus facilitating research and rapid prototyping. We explore different approaches for implementing the low-level aspects of instruction-level data tracking, introduce a more efficient and 64-bit capable shadow memory, and identify (and avoid) the common pitfalls responsible for the excessive performance overhead of previous studies. We evaluate libdft using real applications with large codebases like the Apache and MySQL servers, and the Firefox web browser. We also use a series of benchmarks and utilities to compare libdft with similar systems. Our results indicate that it performs at least as fast, if not faster, than previous solutions, and to the best of our knowledge, we are the first to evaluate the performance overhead of a fast dynamic DFT implementation in such depth. Finally, libdft is freely available as open source software.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Kangkook Jee

Towards a Timely Causality Analysis for Enterprise Security

NoDoze: Combatting Threat Alert Fatigue with Automated Provenance Triage

High Fidelity Data Reduction for Big Data Security Dependency Analyses

You Are What You Do: Hunting Stealthy Malware via Data Provenance Analysis

libdft

Contact Info

Product

Resources

About