Detection of malicious PDF files and directions for enhancements: A state-of-the-art survey

Nissim, Nir; Cohen, Avraham; Glezer, Chanan; Elovici, Yuval

doi:10.1016/j.cose.2014.10.014

Cited by 84 publications

(77 citation statements)

References 18 publications


“…However, static analysis has drawbacks as well, including the inability to detect well obfuscated code that acts maliciously during runtime, in contrast to dynamic analysis that will likely detect that code. Here we present the most relevant studies, however a more comprehensive analysis of related work can be found in our recent survey study [4].…”

Section: Related Work (mentioning)

confidence: 99%

“…Available filters and their primary purposes are discussed by P. Baccas and J. Kittilsen [10], [11]. Table 1 summarizes various code obfuscation techniques employed by attackers [4].…”

Section: Javascript Code Attacks (mentioning)

confidence: 99%

“…In our recent studies [4], [35] we thoroughly investigated PDF files and found that most malicious PDF files (96.5%) are not compatible with the PDF file format specifications (checked with the Adobe PDF Reference). These files cannot be viewed by the PDF reader or the…”

Section: Dataset Collection (mentioning)

confidence: 99%

“…The PDF file structure is depicted in Fig. 1 and is comprised of four basic parts according to the Adobe PDF Reference [4], [5], and [6]:…”

Section: PDF File Structure (mentioning)

confidence: 99%

“…An incident in 2014 aimed at the Israeli ministry of defense (IMOD) provides an example of a new type of targeted cyber-attack involving non-executable files attached to an email. According to media reports, the attackers posed as IMOD representatives and sent email messages with a malicious portable document format (PDF) file attachment which, when opened, installed a Trojan horse enabling the attacker to control the computer.…”

Section: Introduction (mentioning)

confidence: 99%


Keeping pace with the creation of new malicious PDF files using an active-learning based detection framework

et al. 2016

Self Cite


Attackers increasingly take advantage of naive users who tend to treat non-executable files casually, as if they are benign. Such users often open non-executable files even though these files can conceal and perform malicious operations. Existing defensive solutions currently used by organizations prevent executable files from entering organizational networks via web browsers or email messages. Therefore, recent advanced persistent threat attacks tend to leverage non-executable files such as portable document format (PDF) documents, which are used daily by organizations. Machine learning (ML) methods have recently been applied to detect malicious PDF files; however, these techniques lack an essential element: they cannot be efficiently updated daily. In this study we present an active learning (AL) based framework, specifically designed to help anti-virus vendors focus their analytical efforts on acquiring novel malicious content. This focus is accomplished by identifying and acquiring both new PDF files that are most likely malicious and informative benign PDF documents. These files are used for retraining and enhancing the knowledge stores of both the detection model and the anti-virus. We propose two AL based methods: exploitation and combination. Our methods are evaluated and compared to an existing AL method (SVM-margin) and to random sampling over 10 days. Results indicate that on the last day of the experiment, combination outperformed all of the other methods, enriching the signature repository of the anti-virus with almost seven times more new malicious PDF files, while improving the detection model's capabilities further each day. At the same time, it dramatically reduces security experts' efforts by 75%. Despite this significant reduction, results also indicate that our framework better detects new malicious PDF files than leading anti-virus tools commonly used by organizations for protection against malicious PDF files.
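The abstract names SVM-margin and random sampling as the two baselines. As a rough illustration only, the sketch below shows the general idea behind margin-based selection: from a pool of unlabeled files, pick the ones whose classifier score is closest to the decision boundary, since those are the ones the model is least sure about. This is not the authors' implementation. The linear scorer, the two-number feature vectors, the file names, and the helper names (`decision_value`, `select_by_margin`, `select_at_random`) are all assumptions made for the example, and the exploitation and combination methods are not sketched because the abstract does not define them.

```python
import random


def decision_value(weights, bias, features):
    """Signed score of a hypothetical linear classifier.

    The magnitude is proportional to the distance from the separating
    hyperplane; the sign gives the predicted side.
    """
    return sum(w * x for w, x in zip(weights, features)) + bias


def select_by_margin(weights, bias, pool, budget):
    """Margin-based selection: return the `budget` file names whose
    scores are closest to zero, i.e. nearest the decision boundary."""
    ranked = sorted(
        pool,
        key=lambda name: abs(decision_value(weights, bias, pool[name])),
    )
    return ranked[:budget]


def select_at_random(pool, budget, seed=0):
    """Random-sampling baseline: pick `budget` file names uniformly."""
    rng = random.Random(seed)
    return rng.sample(sorted(pool), budget)


# Toy model and toy pool (illustrative values, not real PDF features).
weights, bias = [1.0, -2.0], 0.5
pool = {
    "a.pdf": [3.0, 0.0],
    "b.pdf": [0.5, 0.5],
    "c.pdf": [0.0, 2.0],
    "d.pdf": [1.0, 1.0],
}

picked = select_by_margin(weights, bias, pool, budget=2)
print(picked)
```

In an active-learning loop of the kind the abstract describes, the selected files would be sent to a human expert for labeling and then added to the training set before the model is retrained; how many files to select per day (`budget` here) is a tuning choice.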
