Improving Probability Estimation Through Active Probabilistic Model Learning

Wang, Jingyi; Chen, Xiaohong; Sun, Jun; Qin, Shengchao

doi:10.1007/978-3-319-68690-5_23

Cited by 5 publications

(4 citation statements)

References 12 publications

“…Note that the reward is then used as a guide to select seeds and mutation, i.e., those which are predicted to cover the basic blocks with highest rewards. Our approach is inspired by [52,55], which enables us to build a discrete-time Markov Chain (DTMC) abstraction of the program from the collected fuzzing data. Specifically, Definition 3.2.…”

Section: Reward Calculation (mentioning)

confidence: 99%

Better Pay Attention Whilst Fuzzing

Zhu, Wang, Sun et al. 2021

Preprint

Self Cite

Fuzzing is one of the prevailing methods for vulnerability detection. However, even state-of-the-art fuzzing methods become ineffective after some period of time, i.e., the coverage hardly improves as existing methods are ineffective at focusing the attention of fuzzing on covering the hard-to-trigger program paths. In other words, they cannot generate inputs that can break the bottleneck due to the fundamental difficulty in capturing the complex relations between the test inputs and program coverage. In particular, existing fuzzers suffer from the following main limitations: 1) lacking an overall analysis of the program to identify the most "rewarding" seeds, and 2) lacking an effective mutation strategy which could continuously select and mutate the more relevant "bytes" of the seeds. In this work, we propose an approach called ATTuzz to address these two issues systematically. First, we propose a lightweight dynamic analysis technique which estimates the "reward" of covering each basic block and selects the most rewarding seeds accordingly. Second, we mutate the selected seeds according to a neural network model which predicts whether a certain "rewarding" block will be covered given certain mutation on certain bytes of a seed. The model is a deep learning model equipped with attention mechanism which is learned and updated periodically whilst fuzzing. Our evaluation shows that ATTuzz significantly outperforms 5 state-of-the-art grey-box fuzzers on 13 popular real-world programs at achieving higher edge coverage and finding new bugs. In particular, ATTuzz achieved 2X the edge coverage and detected 4X as many bugs as AFL over 24-hour runs. Moreover, ATTuzz persistently improves the

“…Existing probabilistic model learning algorithms are often based on algorithms designed for learning deterministic (probabilistic) finite automata, which are investigated and evidenced in many previous works including but not limited to [7,8,10,18–20,25,44,45,52]. It is also related to the work on Markov chain estimation [17,56].…”

Section: Related Work (mentioning)

confidence: 99%

Learning probabilistic models for model checking: an evolutionary approach and an empirical study

Wang, Sun, Yuan et al. 2018

Int J Softw Tools Technol Transfer

Self Cite

Many automated system analysis techniques (e.g., model checking, model-based testing) rely on first obtaining a model of the system under analysis. System modeling is often done manually, which is often considered a hindrance to adopting model-based system analysis and development techniques. To overcome this problem, researchers have proposed to automatically "learn" models based on sample system executions and shown that the learned models can be useful sometimes. There are however many questions to be answered. For instance, how much shall we generalize from the observed samples and how fast would learning converge? Or, would the analysis result based on the learned model be more accurate than the estimation we could have obtained by sampling many system executions within the same amount of time? Moreover, how well does learning scale to real-world applications? If the answer is negative, what are the potential methods to improve the efficiency of learning? In this work, we first investigate existing algorithms for learning probabilistic models for model checking and propose an evolution-based approach for better controlling the degree of generalization. Then, we present existing approaches to learn abstract models to improve the efficiency of learning for scalability reasons. Lastly, we conduct an empirical study in order to answer the above questions. Our findings include that the effectiveness of learning may sometimes be limited and it is worth investigating how abstraction should be done properly in order to learn abstract models.

“…This work is inspired by the recent trend on adopting machine learning to automatically learn models for model checking. Various kinds of model learning algorithms have been investigated including continuous-time Markov Chain [25], DTMC [19,6,33,31,34] and Markov Decision Process [18,3]. In particular, this case study is closely related to the learning approach called LAR documented in [32], which combines model learning and abstraction refinement to automatically find a proper level of abstraction to treat the problem of real-typed variables.…”

Section: Conclusion and Related Work (mentioning)

confidence: 99%

Towards ‘Verifying’ a Water Treatment System

et al. 2018

Self Cite

Modeling and verifying real-world cyber-physical systems is challenging, especially for complex systems where manual modeling is infeasible. In this work, we report our experience on combining model learning and abstraction refinement to analyze a challenging system, i.e., a real-world Secure Water Treatment system (SWaT). Given a set of safety requirements, the objective is to either show that the system is safe with a high probability (so that a system shutdown is rarely triggered due to safety violation) or not. As the system is too complicated to be manually modeled, we apply the latest automatic model learning techniques to construct a set of Markov chains through abstraction and refinement, based on two long system execution logs (one for training and the other for testing). For each probabilistic safety property, we either report that it does not hold with a certain level of probabilistic confidence, or report that it holds by showing the evidence in the form of an abstract Markov chain. The Markov chains can subsequently be implemented as runtime monitors in SWaT.
