DeepMutation: Mutation Testing of Deep Learning Systems

Ma, Lei; Zhang, Fuyuan; Sun, Jiyuan; Xue, Minhui; Li, Bo; Juefei-Xu, Felix; Xie, Chao; Li, Li; Liu, Yang; Zhao, Jianjun; Wang, Yadong

doi:10.48550/arxiv.1805.05206

Cited by 17 publications

(14 citation statements)

References 0 publications


“…In order to test our hypothesis (and develop a practical algorithm), we need a systematic way of generating mutants of a given DNN model. We adopt the method developed in [26], which is a proposal for applying mutation testing to DNN. Mutation testing [19] is a well-known technique to evaluate the quality of a test suite, and thus is different from our work.…”

Section: A Mutating Deep Neural Network (mentioning)

confidence: 99%

“…Given the difference between traditional software systems and DNN, mutation operators designed for traditional programs cannot be directly applied to DNN. In [26], Ma et al. introduced a set of mutation operators for DNN-based systems at different levels, such as the source level (e.g., the training data and training programs) and the model level (e.g., the DNN model).…”

Section: A Mutating Deep Neural Network (mentioning)

confidence: 99%

“…In this work, we require a large group of slightly mutated models for runtime adversarial sample detection. Of all the mutation operators proposed in [26], mutation operators defined at the source level are not considered. The reason is that we would need to train the mutated models from scratch, which is often time-consuming.…”

Section: A Mutating Deep Neural Network (mentioning)

confidence: 99%
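To make the idea of a model-level mutation concrete, here is a minimal sketch of one possible operator: adding Gaussian noise to a small random fraction of a trained model's weights, so that no retraining is needed. The function name `gaussian_fuzz`, its parameters, and the representation of a model as a list of NumPy arrays are assumptions made for illustration; they are not taken from [26] or from the citing paper.

```python
import numpy as np

def gaussian_fuzz(layers, sigma=0.1, fraction=0.05, seed=0):
    """Return a mutated copy of `layers` (a list of weight arrays).

    Hypothetical model-level operator: each weight is perturbed with
    probability `fraction` by noise drawn from N(0, sigma^2).
    The input arrays are left unchanged.
    """
    rng = np.random.default_rng(seed)
    mutant = []
    for w in layers:
        mask = rng.random(w.shape) < fraction        # which weights to mutate
        noise = rng.normal(0.0, sigma, size=w.shape)  # candidate perturbations
        mutant.append(np.where(mask, w + noise, w))
    return mutant

# Usage: derive several slightly different mutants from one set of weights.
original = [np.ones((50, 50)), np.zeros(10)]
mutants = [gaussian_fuzz(original, seed=s) for s in range(5)]
```

Because the operator only edits existing weights, generating many mutants costs far less than retraining, which matches the reason the statement above gives for skipping source-level operators.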

“…Once detected, it can be rejected or checked depending on different applications. Our detection algorithm integrates mutation testing of DNN models [26] and statistical hypothesis testing [3]. It is designed based on the observation that adversarial samples are much more sensitive to mutation on the DNN than normal samples, i.e., if we mutate the DNN slightly, the mutated DNN is more likely to change the label on the adversarial sample than that on the normal one.…”

Section: Introduction (mentioning)

confidence: 99%
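The sensitivity observation in the statement above can be expressed as a simple ratio: the fraction of mutated models whose predicted label differs from the original model's label. The sketch below assumes the labels have already been computed; the name `label_change_rate` and its interface are illustrative assumptions, and the statistical hypothesis test that the citing paper combines with this measure is not shown.

```python
import numpy as np

def label_change_rate(original_label, mutant_labels):
    """Fraction of mutant predictions that differ from the original label."""
    mutant_labels = np.asarray(mutant_labels)
    if mutant_labels.size == 0:
        raise ValueError("need at least one mutant prediction")
    return float(np.mean(mutant_labels != original_label))

# Usage: one of four mutants changes the label, so the rate is 0.25.
rate = label_change_rate(3, [3, 3, 5, 3])
print(rate)  # 0.25
```

Under the observation quoted above, an adversarial sample would tend to produce a higher rate than a normal sample for the same set of mutants.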


Adversarial Sample Detection for Deep Neural Network through Model Mutation Testing

Wang, Dong, Sun et al. 2019

2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)


Deep neural networks (DNNs) have been shown to be useful in a wide range of applications. However, they are also known to be vulnerable to adversarial samples. By transforming a normal sample with carefully crafted, human-imperceptible perturbations, even highly accurate DNNs can be made to make wrong decisions. Multiple defense mechanisms have been proposed which aim to hinder the generation of such adversarial samples. However, recent work shows that most of them are ineffective. In this work, we propose an alternative approach to detect adversarial samples at runtime. Our main observation is that adversarial samples are much more sensitive than normal samples if we impose random mutations on the DNN. We thus first propose a measure of 'sensitivity' and show empirically that normal samples and adversarial samples have distinguishable sensitivity. We then integrate statistical hypothesis testing and model mutation testing to check whether an input sample is likely to be normal or adversarial at runtime by measuring its sensitivity. We evaluated our approach on the MNIST and CIFAR10 datasets. The results show that our approach detects adversarial samples generated by state-of-the-art attacking methods efficiently and accurately.


“…Ma et al. [14] proposed a few operators to introduce changes at both the data and model levels and evaluated the quality of test data by analyzing the extent to which the introduced changes could be detected. Similarly, many existing works [28], [20], [18], [3] apply various heuristics, mostly based on gradient descent or evolutionary techniques, to modify the important pixels.…”

Section: Adversarial Testing (mentioning)

confidence: 99%

Coverage Testing of Deep Learning Models using Dataset Characterization

Mani, Sankaran, Tamilselvam et al. 2019

Preprint


Deep Neural Networks (DNNs), with their promising performance, are being increasingly used in safety-critical applications such as autonomous driving, cancer detection, and secure authentication. With the growing importance of deep learning, there is a requirement for a more standardized framework to evaluate and test deep learning models. The primary challenges involved in the automated generation of extensive test cases are: (i) neural networks are difficult to interpret and debug and (ii) availability of human annotators to generate specialized test points. In this research, we explain the necessity to measure the quality of a dataset and propose a test case generation system guided by the dataset properties. From a testing perspective, four different dataset quality dimensions are proposed: (i) equivalence partitioning, (ii) centroid positioning, (iii) boundary conditioning, and (iv) pair-wise boundary conditioning. The proposed system is evaluated on well-known image classification datasets such as MNIST, Fashion-MNIST, CIFAR10, CIFAR100, and SVHN against popular deep learning models such as LeNet, ResNet-20, and VGG-19. Further, we conduct various experiments to demonstrate the effectiveness of the systematic test case generation system for evaluating deep learning models.

CCS Concepts: Software and its engineering → Empirical software validation; Computing methodologies → Machine learning algorithms.


Investigating the impact of transient hardware faults on deep learning neural network inference

Rahman, Laskar 2024

Software Testing Verif & Rel


Summary: Safety-critical applications, such as autonomous vehicles, healthcare, and space applications, have witnessed widespread deployment of deep neural networks (DNNs). Inherent algorithmic inaccuracies have consistently been a prevalent cause of misclassifications, even in modern DNNs. Simultaneously, with an ongoing effort to minimize the footprint of contemporary chip design, there is a continual rise in the likelihood of transient hardware faults in deployed DNN models. Consequently, researchers have questioned the extent to which these faults contribute to DNN misclassifications compared to algorithmic inaccuracies. This article delves into the impact of DNN misclassifications caused by transient hardware faults and intrinsic algorithmic inaccuracies in safety-critical applications. Initially, we enhance a cutting-edge fault injector, TensorFI, for TensorFlow applications to facilitate fault injections on modern DNN non-sequential models in a scalable manner. Subsequently, we analyse the DNN-inferred outcomes based on our defined safety-critical metrics. Finally, we conduct extensive fault injection experiments and a comprehensive analysis to achieve the following objectives: (1) investigate the impact of different target class groupings on DNN failures and (2) pinpoint the most vulnerable bit locations within tensors, as well as DNN layers accountable for the majority of safety-critical misclassifications. Our findings regarding different grouping formations reveal that failures induced by transient hardware faults can have a substantially greater impact (with a probability up to 4× higher) on safety-critical applications compared to those resulting from algorithmic inaccuracies. Additionally, our investigation demonstrates that higher-order bit positions in tensors, as well as initial and final layers of DNNs, necessitate prioritized protection compared to other regions.
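The point about higher-order bit positions can be illustrated on a single value. The sketch below flips one bit of a number stored as an IEEE 754 float32 and compares the effect of a low mantissa bit with that of a high exponent bit. It is not TensorFI's API: `flip_bit` is a hypothetical helper, and float32 storage is an assumption made for the example.

```python
import struct

def flip_bit(value: float, bit: int) -> float:
    """Flip bit `bit` (0 = least significant) of `value` stored as float32."""
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range 0..31")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # Toggle the chosen bit and reinterpret the result as float32.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped

weight = 0.5
print(flip_bit(weight, 0))   # lowest mantissa bit: value barely changes
print(flip_bit(weight, 30))  # highest exponent bit: value becomes enormous
print(flip_bit(weight, 31))  # sign bit: -0.5
```

A flip in the low mantissa bits shifts the value by a tiny amount, while a flip in the top exponent bit changes it by many orders of magnitude, which is consistent with the abstract's conclusion that higher-order bits deserve prioritized protection.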
