Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security
DOI: 10.1145/3243734.3243804

Evaluating Fuzz Testing

Abstract: Fuzz testing has enjoyed great success at discovering security critical bugs in real software. Recently, researchers have devoted significant effort to devising new fuzzing techniques, strategies, and algorithms. Such new ideas are primarily evaluated experimentally, so an important question is: What experimental setup is needed to produce trustworthy results? We surveyed the recent research literature and assessed the experimental evaluations carried out by 32 fuzzing papers. We found problems in every evaluation we considered…

Cited by 452 publications (328 citation statements) | References 45 publications
“…), we are unable to identify which benchmark characteristics are optimal for AFL-Dyninst's performance. Across all benchmarks, UnTracer achieves an aver… Mann-Whitney U-test scoring: Following Klees et al.'s [59] recommendation, we utilize the Mann-Whitney U-test to determine whether UnTracer's execution overhead is stochastically smaller than AFL-QEMU's and AFL-Dyninst's. First we compute all per-dataset execution times for each benchmark and tracer combination; then for each benchmark dataset we apply the Mann-Whitney U-test with a 0.05 significance level to the execution times of UnTracer versus AFL-QEMU and of UnTracer versus AFL-Dyninst.…”
Section: E. UnTracer Versus Coverage-Agnostic Tracing (mentioning)
confidence: 99%
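As a concrete illustration of the recommended test, a minimal Python sketch of one per-benchmark comparison might look like the following; the timing values and variable names are invented for illustration, not taken from the UnTracer evaluation:

```python
# Sketch: one-sided Mann-Whitney U-test per benchmark, as described above.
# The timing data below are hypothetical placeholders.
from scipy.stats import mannwhitneyu

# Hypothetical per-dataset execution times (in seconds) for one benchmark.
untracer_times = [1.2, 1.1, 1.3, 1.0, 1.2]
afl_qemu_times = [2.9, 3.1, 2.8, 3.0, 3.2]

# alternative="less" asks whether UnTracer's overhead is stochastically
# smaller than AFL-QEMU's, at the 0.05 significance level used above.
stat, p = mannwhitneyu(untracer_times, afl_qemu_times, alternative="less")
print(f"U = {stat}, p = {p:.4f}, significant: {p < 0.05}")
```

The same call would be repeated for UnTracer versus AFL-Dyninst on every benchmark dataset.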
“…In particular, we define the performance of micro-fuzzing as the number of AC vulnerabilities detected in a test artifact over time, and consider one strategy to outperform another if the strategy detects more AC vulnerabilities given the same time budget. In accordance with recently proposed guidelines for evaluating new fuzz testing techniques [33], we evaluate our proposed seed selection strategy (SRI) by comparing the performance of micro-fuzzing with SRI-based seeds to micro-fuzzing with "empty seed values" (IVI-based seeds).…”
Section: Discussion (mentioning)
confidence: 99%
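The quoted performance criterion boils down to counting distinct vulnerabilities detected within a fixed time budget. A hedged sketch of that comparison, where `run_micro_fuzzer` is a hypothetical stand-in for one micro-fuzzing iteration rather than a real API:

```python
# Sketch: compare two seeding strategies by the number of distinct
# vulnerabilities found in the same time budget.
import time

def count_vulns(strategy, budget_seconds, run_micro_fuzzer):
    """Micro-fuzz with `strategy` seeds; count distinct vulnerabilities."""
    found = set()
    deadline = time.monotonic() + budget_seconds
    while time.monotonic() < deadline:
        vuln = run_micro_fuzzer(strategy)  # returns a vuln id or None
        if vuln is not None:
            found.add(vuln)
    return len(found)

# SRI outperforms IVI under this criterion if, for the same budget,
#   count_vulns("SRI", 3600, run_micro_fuzzer)
# exceeds
#   count_vulns("IVI", 3600, run_micro_fuzzer)
```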
“…We define two such procedures that we describe below: Identity Value Instantiation (IVI), and Small Recursive Instantiation (SRI). a) Identity Value Instantiation: Recent work has proposed guidelines for evaluating new fuzz testing techniques [33]. One of these guidelines is to compare any proposed strategy for constructing seed inputs for fuzz testing with "empty" seed inputs.…”
Section: Resource Consumption Optimization (mentioning)
confidence: 99%
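To make the contrast between the two procedures concrete, here is a loose Python analogue; the original work defines IVI and SRI over Java method parameters, so the type handling below is purely illustrative:

```python
# Illustrative analogue of the two seed-construction procedures.
# Real IVI/SRI operate on Java method arguments, not Python values.
import random

def ivi(param_type):
    """Identity Value Instantiation: the 'empty' identity value per type."""
    return {int: 0, float: 0.0, bool: False, str: "", list: []}[param_type]

def sri(param_type, max_size=3):
    """Small Recursive Instantiation: small random values, recursing
    into container types with bounded size."""
    if param_type is int:
        return random.randint(-max_size, max_size)
    if param_type is float:
        return random.uniform(-max_size, max_size)
    if param_type is bool:
        return random.choice([False, True])
    if param_type is str:
        return "".join(random.choice("abc")
                       for _ in range(random.randint(0, max_size)))
    if param_type is list:
        return [sri(int, max_size) for _ in range(random.randint(0, max_size))]

print(ivi(str), ivi(list))  # the "empty seed values": '' and []
print(sri(str), sri(list))  # small random instantiations
```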
“…This set of experiments aims to illustrate the efficiency improvement of PTrix when executing the same number of inputs. Second, following the best practice [20], we evaluated PTrix on efficiency of code coverage, which is a widely accepted utility metric for fuzzers [23,24]. Recall that PTrix uses feedback that has higher path-sensitivity than QEMU-AFL.…”
Section: Discussion (mentioning)
confidence: 99%
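"Efficiency of code coverage" is typically measured as cumulative coverage growth over executed inputs (or over time). A minimal sketch, assuming a hypothetical `get_edge_coverage` harness hook rather than any real PTrix or AFL interface:

```python
# Sketch: cumulative edge coverage as a fuzzer utility metric.
# `get_edge_coverage` is a hypothetical hook returning the set of
# edge ids exercised by one input; it is not a PTrix or AFL API.
def coverage_growth(inputs, get_edge_coverage):
    covered, growth = set(), []
    for inp in inputs:
        covered |= get_edge_coverage(inp)
        growth.append(len(covered))
    return growth  # cumulative coverage after each executed input
```

Comparing two fuzzers on the same input or time budget then amounts to comparing these coverage-growth curves.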