Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security
DOI: 10.1145/3243734.3243804

Evaluating Fuzz Testing

Abstract: Fuzz testing has enjoyed great success at discovering security critical bugs in real software. Recently, researchers have devoted significant effort to devising new fuzzing techniques, strategies, and algorithms. Such new ideas are primarily evaluated experimentally, so an important question is: What experimental setup is needed to produce trustworthy results? We surveyed the recent research literature and assessed the experimental evaluations carried out by 32 fuzzing papers. We found problems in every evaluation we considered…

Cited by 452 publications (328 citation statements) | References 45 publications
“…), we are unable to identify which benchmark characteristics are optimal for AFL-Dyninst's performance. Across all benchmarks, UnTracer achieves an aver… Mann-Whitney U-test scoring: Following Klees et al.'s [59] recommendation, we utilize the Mann-Whitney U-test to determine whether UnTracer's execution overhead is stochastically smaller than AFL-QEMU's and AFL-Dyninst's. First we compute all per-dataset execution times for each benchmark and tracer combination; then for each benchmark dataset we apply the Mann-Whitney U-test with a 0.05 significance level to the execution times of UnTracer versus AFL-QEMU and of UnTracer versus AFL-Dyninst.…”
Section: E. UnTracer Versus Coverage-Agnostic Tracing (mentioning)
confidence: 99%
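As a concrete illustration of the recommended test, a minimal Python sketch of one per-benchmark comparison might look like the following; the timing values and variable names are invented for illustration, not taken from the UnTracer evaluation:

```python
# Sketch: one-sided Mann-Whitney U-test per benchmark, as described above.
# The timing data below are hypothetical placeholders.
from scipy.stats import mannwhitneyu

# Hypothetical per-dataset execution times (in seconds) for one benchmark.
untracer_times = [1.2, 1.1, 1.3, 1.0, 1.2]
afl_qemu_times = [2.9, 3.1, 2.8, 3.0, 3.2]

# alternative="less" asks whether UnTracer's overhead is stochastically
# smaller than AFL-QEMU's, at the 0.05 significance level used above.
stat, p = mannwhitneyu(untracer_times, afl_qemu_times, alternative="less")
print(f"U = {stat}, p = {p:.4f}, significant: {p < 0.05}")
```

The same call would be repeated for UnTracer versus AFL-Dyninst on every benchmark dataset.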
“…In particular, we define the performance of micro-fuzzing as the number of AC vulnerabilities detected in a test artifact over time, and consider one strategy to outperform another if the strategy detects more AC vulnerabilities given the same time budget. In accordance with recently proposed guidelines for evaluating new fuzz testing techniques [33], we evaluate our proposed seed selection strategy (SRI) by comparing the performance of micro-fuzzing with SRI-based seeds to micro-fuzzing with "empty seed values" (IVI-based seeds).…”
Section: Discussion (mentioning)
confidence: 99%
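The quoted performance criterion boils down to counting distinct vulnerabilities detected within a fixed time budget. A hedged sketch of that comparison, where `run_micro_fuzzer` is a hypothetical stand-in for one micro-fuzzing iteration rather than a real API:

```python
# Sketch: compare two seeding strategies by the number of distinct
# vulnerabilities found in the same time budget.
import time

def count_vulns(strategy, budget_seconds, run_micro_fuzzer):
    """Micro-fuzz with `strategy` seeds; count distinct vulnerabilities."""
    found = set()
    deadline = time.monotonic() + budget_seconds
    while time.monotonic() < deadline:
        vuln = run_micro_fuzzer(strategy)  # returns a vuln id or None
        if vuln is not None:
            found.add(vuln)
    return len(found)

# SRI outperforms IVI under this criterion if, for the same budget,
#   count_vulns("SRI", 3600, run_micro_fuzzer)
# exceeds
#   count_vulns("IVI", 3600, run_micro_fuzzer)
```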
“…We define two such procedures that we describe below: Identity Value Instantiation (IVI), and Small Recursive Instantiation (SRI). a) Identity Value Instantiation: Recent work has proposed guidelines for evaluating new fuzz testing techniques [33]. One of these guidelines is to compare any proposed strategy for constructing seed inputs for fuzz testing with "empty" seed inputs.…”
Section: Resource Consumption Optimization (mentioning)
confidence: 99%
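To make the contrast between the two procedures concrete, here is a loose Python analogue; the original work defines IVI and SRI over Java method parameters, so the type handling below is purely illustrative:

```python
# Illustrative analogue of the two seed-construction procedures.
# Real IVI/SRI operate on Java method arguments, not Python values.
import random

def ivi(param_type):
    """Identity Value Instantiation: the 'empty' identity value per type."""
    return {int: 0, float: 0.0, bool: False, str: "", list: []}[param_type]

def sri(param_type, max_size=3):
    """Small Recursive Instantiation: small random values, recursing
    into container types with bounded size."""
    if param_type is int:
        return random.randint(-max_size, max_size)
    if param_type is float:
        return random.uniform(-max_size, max_size)
    if param_type is bool:
        return random.choice([False, True])
    if param_type is str:
        return "".join(random.choice("abc")
                       for _ in range(random.randint(0, max_size)))
    if param_type is list:
        return [sri(int, max_size) for _ in range(random.randint(0, max_size))]

print(ivi(str), ivi(list))  # the "empty seed values": '' and []
print(sri(str), sri(list))  # small random instantiations
```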
“…This set of experiments aims to illustrate the efficiency improvement of PTrix when executing the same number of inputs. Second, following the best practice [20], we evaluated PTrix on efficiency of code coverage, which is a widely accepted utility metric for fuzzers [23,24]. Recall that PTrix uses feedback that has higher path-sensitivity than QEMU-AFL.…”
Section: Discussion (mentioning)
confidence: 99%
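"Efficiency of code coverage" is typically measured as cumulative coverage growth over executed inputs (or over time). A minimal sketch, assuming a hypothetical `get_edge_coverage` harness hook rather than any real PTrix or AFL interface:

```python
# Sketch: cumulative edge coverage as a fuzzer utility metric.
# `get_edge_coverage` is a hypothetical hook returning the set of
# edge ids exercised by one input; it is not a PTrix or AFL API.
def coverage_growth(inputs, get_edge_coverage):
    covered, growth = set(), []
    for inp in inputs:
        covered |= get_edge_coverage(inp)
        growth.append(len(covered))
    return growth  # cumulative coverage after each executed input
```

Comparing two fuzzers on the same input or time budget then amounts to comparing these coverage-growth curves.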