Malware is constantly adapting in order to avoid detection. Model based malware detectors, such as SVM and neural networks, are vulnerable to so-called adversarial examples which are modest changes to detectable malware that allows the resulting malware to evade detection. Continuous-valued methods that are robust to adversarial examples of images have been developed using saddle-point optimization formulations. We are inspired by them to develop similar methods for the discrete, e.g. binary, domain which characterizes the features of malware. A specific extra challenge of malware is that the adversarial examples must be generated in a way that preserves their malicious functionality. We introduce methods capable of generating functionally preserved adversarial malware examples in the binary domain. Using the saddle-point formulation, we incorporate the adversarial examples into the training of models that are robust to them. We evaluate the effectiveness of the methods and others in the literature on a set of Portable Execution (PE) files. Comparison prompts our introduction of an online measure computed during training to assess general expectation of robustness.
Motivated by Danskin's theorem, gradient-based methods have been applied with empirical success to solve minimax problems that involve non-convex outer minimization and non-concave inner maximization. On the other hand, recent work has demonstrated that Evolution Strategies (ES) algorithms are stochastic gradient approximators that seek robust solutions. In this paper, we address black-box (gradient-free) minimax problems that have long been tackled in a coevolutionary setup. To this end and guaranteed by Danskin's theorem, we employ ES as a stochastic estimator for the descent direction. The proposed approach is validated on a collection of black-box minimax problems. Based on our experiments, our method's performance is comparable with its coevolutionary counterparts and favorable for high-dimensional problems. Its efficacy is demonstrated on a real-world application.
With the celebrated success of deep learning, some attempts to develop effective methods for detecting malicious PowerShell programs employ neural nets in a traditional natural language processing setup while others employ convolutional neural nets to detect obfuscated malicious commands at a character level. While these representations may express salient PowerShell properties, our hypothesis is that tools from static program analysis will be more effective. We propose a hybrid approach combining traditional program analysis (in the form of abstract syntax trees) and deep learning. This poster presents preliminary results of a fundamental step in our approach: learning embeddings for nodes of PowerShell ASTs. We classify malicious scripts by family type and explore embedded program vector representations.
CCS CONCEPTS• Security and privacy → Malware and its mitigation; • Computing methodologies → Neural networks; KEYWORDS powershell scripts; malware; deep learning; abstract syntax trees ACM Reference Format:
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.