2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE) 2017
DOI: 10.1109/ase.2017.8115618

Learn&Fuzz: Machine learning for input fuzzing

Abstract: Fuzzing consists of repeatedly testing an application with modified, or fuzzed, inputs with the goal of finding security vulnerabilities in input-parsing code. In this paper, we show how to automate the generation of an input grammar suitable for input fuzzing using sample inputs and neural-network-based statistical machine-learning techniques. We present a detailed case study with a complex input format, namely PDF, and a large complex security-critical parser for this format, namely, the PDF parser embedded …

Cited by 287 publications (238 citation statements)
References 25 publications
“…Table I shows that the new seed corpora generated by our framework caused up to 2.48% more basic blocks and 24.30% more execution paths being covered than the original seed corpus. Our results significantly surpassed similar works such as [1], which generated seed corpora by learning the grammar of the PDF files and the new corpora covered 0.11% more instructions. We next evaluated our framework by fuzzing MuPDF and three other PDF viewers (pdfium, podofo, and poppler) with the original and generated corpus for 24 hours.…”
Section: Evaluations (contrasting)
confidence: 51%
“…Most existing fuzzing tools, or fuzzers, generate excessive test inputs by mutating a pre-selected corpus of seed inputs with the hope to reveal potential bugs in the target program. Therefore, extensive research effort has been dedicated to improving the quality of seed corpora [1]. Existing approaches in this direction, however, share a common limitation that they focus on discovering syntactic or semantic constraints posed by the target program for inputs in order to generate valid seed inputs.…”
Section: Introduction (mentioning)
confidence: 99%
“…Therefore, Superion may have trouble finding proprietary grammars or undocumented extensions to standard grammars. However, several automatic grammar inference techniques [7,29,34,63] have been proposed, we plan to integrate such techniques to have a wider applicability.…”
Section: H Discussion (mentioning)
confidence: 99%
“…There has also been some recent interest in automatically generating input grammars from existing inputs, using machine learning [41] and language inference algorithms [22]. Similarly, DIFUZE [33] infers device driver interfaces from a running kernel to bootstrap subsequent structured fuzzing.…”
Section: Related Work (mentioning)
confidence: 99%