A Feature-Oriented Corpus for Understanding, Evaluating and Improving Fuzz Testing

Zhu, Xiaogang; Feng, Xiaotao; Jiao, Tengyun; Wen, Sheng; Xiang, Yang; Camtepe, Seyit; Xue, Jingling

doi:10.1145/3321705.3329845

Cited by 10 publications

(5 citation statements)

References 16 publications


“…First, some fuzzers may be difficult or complicated to be used directly. For instance, Zhu et al [15] stated that they could not appropriately run Driller [17], T-Fuzz [7] and VUzzer [8]. Second, we find that there are numerous flaws (e.g., incorrect judgment on crash, abnormal behaviors during the fuzzing process) with the implementation of many fuzzers, which may cause negative impacts on their performance.…”

Section: Motivation of UNIFUZZ (mentioning)

confidence: 74%

“…Conducting comprehensive and pragmatic evaluations of fuzzers entails overcoming multiple important challenges. First, although many fuzzers have been open sourced, their usability in practice is often limited, as reported by recent research [7,15], which results in reproducibility issues, impeding comparison. Thus, it is necessary to test and enhance fuzzers' usability.…”

Section: Introduction (mentioning)

confidence: 99%


UNIFUZZ: A Holistic and Pragmatic Metrics-Driven Platform for Evaluating Fuzzers

Li, Ji, Chen et al. 2020

Preprint


A flurry of fuzzing tools (fuzzers) have been proposed in the literature, aiming at detecting software vulnerabilities effectively and efficiently. To date, it is however still challenging to compare fuzzers due to the inconsistency of the benchmarks, performance metrics, and/or environments for evaluation, which buries the useful insights and thus impedes the discovery of promising fuzzing primitives. In this paper, we design and develop UNIFUZZ, an open-source and metrics-driven platform for assessing fuzzers in a comprehensive and quantitative manner. Specifically, UNIFUZZ to date has incorporated 35 usable fuzzers, a benchmark of 20 real-world programs, and six categories of performance metrics. We first systematically study the usability of existing fuzzers, find and fix a number of flaws, and integrate them into UNIFUZZ. Based on the study, we propose a collection of pragmatic performance metrics to evaluate fuzzers from six complementary perspectives. Using UNIFUZZ, we conduct in-depth evaluations of several prominent fuzzers including AFL VUzzer64 [8]. We find that none of them outperforms the others across all the target programs, and that using a single metric to assess the performance of a fuzzer may lead to unilateral conclusions, which demonstrates the significance of comprehensive metrics. Moreover, we identify and investigate previously overlooked factors that may significantly affect a fuzzer's performance, including instrumentation methods and crash analysis tools. Our empirical results show that they are critical to the evaluation of a fuzzer. We hope that our findings can shed light on reliable fuzzing evaluation, so that we can discover promising fuzzing primitives to effectively facilitate fuzzer designs in the future.

Yuwei Li and Shouling Ji are the co-first authors. Shouling Ji and Chunming Wu are the co-corresponding authors.


“…We integrated GSPR into two popular coverage-guided fuzzers, AFL and AFLFast, which we chose because they are representative of coverage-guided fuzzers. They are frequently adopted by other works [4] [9][10][11][12]. We evaluated 7 real open-source applications.…”

Section: GSPR and Repetition Rate (mentioning)

confidence: 99%

BitAFL: Provide More Accurate Coverage Information for Coverage-guided Fuzzing

Xu, Yang, Chen et al. 2023

Atlantis Highlights in Engineering


CGF (Coverage-guided fuzzing) has found a large number of software vulnerabilities with its low cost and adaptability. CGF mutates at the bit or byte level, so most of the mutated test cases cover the same paths. But no previous work had quantified the percentage of test cases that covered the duplicate paths. Therefore, we designed the experimental framework GSPR (get same path rate) based on AFL. We fuzzed seven applications using GSPR and found that approximately 70% of the test cases covered duplicate paths. Based on the above experimental results, we solve the hash collision issue in AFL. We analyzed the various situations that cause hash collision, and introduced the concepts of local collision and global collision. Because a large number of test cases cover duplicate paths, there are much repeated global collision. Based on these findings, we propose different solutions to hash collision according to the size of target program. We extended AFL to implement BitAFL and evaluated it on seven applications. In a comparison experiment with AFL, the results show that our method can completely eliminate hash collisions in small programs. In large programs, BitAFL is able to reduce collisions by more than 80%. In addition, on average, BitAFL found 8.87% more paths than AFL. In summary, our approach provides AFL with more accurate coverage information and can find more paths.
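The hash collision issue mentioned in this abstract can be illustrated with a minimal sketch. It assumes the edge-indexing scheme described in AFL's own technical notes (the current block ID XORed with the previous block ID shifted right by one, indexing a 64 KiB map) and is not BitAFL's implementation. The function names, the randomly generated toy program, and the block and edge counts are all hypothetical.

```python
import random

MAP_SIZE = 1 << 16  # AFL's default coverage map size (64 KiB); an assumption here


def edge_index(prev_id: int, cur_id: int) -> int:
    """AFL-style edge slot: current block ID XOR (previous block ID >> 1)."""
    return (cur_id ^ (prev_id >> 1)) % MAP_SIZE


def count_collisions(edges, block_ids) -> int:
    """Count distinct edges whose slot is already taken by another edge."""
    used_slots = set()
    collisions = 0
    for src, dst in edges:
        slot = edge_index(block_ids[src], block_ids[dst])
        if slot in used_slots:
            collisions += 1  # two different edges are indistinguishable in the map
        else:
            used_slots.add(slot)
    return collisions


def simulate(num_blocks: int, num_edges: int, seed: int = 0) -> int:
    """Toy program (hypothetical): random block IDs and random distinct edges."""
    rng = random.Random(seed)
    block_ids = [rng.randrange(MAP_SIZE) for _ in range(num_blocks)]
    edges = set()
    while len(edges) < num_edges:
        edges.add((rng.randrange(num_blocks), rng.randrange(num_blocks)))
    return count_collisions(sorted(edges), block_ids)


if __name__ == "__main__":
    for n_edges in (1_000, 5_000, 20_000):
        print(n_edges, "edges ->", simulate(2_000, n_edges), "colliding edges")
```

In this toy model the number of colliding edges grows with the number of edges, which is consistent with the abstract's choice to treat small and large programs differently.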


“…The evaluation of fuzzing is usually conducted separately from the detection stage. However, we consider the evaluation as a part of the fuzzing processes because a proper evaluation can help improve the performance of fuzzing [215]. A proper evaluation includes effective experimental corpus [215], fair evaluation environment/platform [30,104,126], reasonable fuzzing time [17,20], and comprehensive comparison metrics [96,104].…”

Section: Evaluation Theory (mentioning)

confidence: 99%

“…However, we consider the evaluation as a part of the fuzzing processes because a proper evaluation can help improve the performance of fuzzing [215]. A proper evaluation includes effective experimental corpus [215], fair evaluation environment/platform [30,104,126], reasonable fuzzing time [17,20], and comprehensive comparison metrics [96,104]. Although these researches have made efforts on proper evaluations, it is still an open question about how to evaluate techniques (i.e., the fuzzing algorithms) instead of implementations (i.e., the code that implements the algorithms) [18].…”

Section: Evaluation Theory (mentioning)

confidence: 99%

Fuzzing: A Survey for Roadmap

et al. 2022

Self Cite


Fuzz testing (fuzzing) has witnessed its prosperity in detecting security flaws recently. It generates a large number of test cases and monitors the executions for defects. Fuzzing has detected thousands of bugs and vulnerabilities in various applications. Although effective, there lacks systematic analysis of gaps faced by fuzzing. As a technique of defect detection, fuzzing is required to narrow down the gaps between the entire input space and the defect space. Without limitation on the generated inputs, the input space is infinite. However, defects are sparse in an application, which indicates that the defect space is much smaller than the entire input space. Besides, because fuzzing generates numerous test cases to repeatedly examine targets, it requires fuzzing to perform in an automatic manner. Due to the complexity of applications and defects, it is challenging to automatize the execution of diverse applications. In this paper, we systematically review and analyze the gaps as well as their solutions, considering both breadth and depth. This survey can be a roadmap for both beginners and advanced developers to better understand fuzzing.
