Synthetic data augmentation for surface defect detection and classification using deep learning

Jain, Saksham; Seth, Gautam; Paruthi, Arpit; Soni, Umang; Kumar, Girish

doi:10.1007/s10845-020-01710-x

Cited by 139 publications

(46 citation statements)

References 33 publications


“…However, this system has few limitation, since the synthetic defect data were generated based on the knowledge of the experts, the classifier fails to detect unknown defects. Jain et al [9] suggested a data augmentation method using Generative adversarial networks to generate synthetic data then they used Convolutional Neural Network to classify surface defects in hot-rolled steel strips. Shon et al [10] proposed automatic data augmentation with: rotation, flipping, shifting, shearing range, and zooming techniques and deep learning method to identify defects of wafer.…”

Section: Defect Detection Methods With Data Augmentation (mentioning)

confidence: 99%

A sight on defect detection methods for imbalanced industrial data

Chaabi, Hamlich 2022

ITM Web Conf.


Product defect detection is a challenging task, especially in situations where it is difficult and costly to collect defect samples. This makes supervised algorithms hard to apply, since their performance degrades when the model is trained on imbalanced data. To tackle this problem, researchers have used data augmentation and one-class classification to detect defects in industrial settings. In this paper, we list defect detection applications for imbalanced industrial data and report the benefits and limitations of those methods.


“…By generating artificial data that are similar to the original data, and thus augmenting the training dataset, GANs can be used for data augmentation. GANs, for example, in the papers by Shao et al [22], Ortego et al [19], and more recently in Jain et al [12] are intended to produce realistic synthesized signals with labels for further use in machine fault diagnosis. A limitation of such methods is that all augmented samples produced may not be physically plausible and may show unrealistic artifacts.…”

Section: Background and Related Work 21 Small Sample ML And Data Augm... (mentioning)

confidence: 99%

Testing of Machine Learning Models with Limited Samples: An Industrial Vacuum Pumping Application

Chatterjee, Ahmed, Hallin et al. 2022

Preprint


There is often a scarcity of training data for machine learning (ML) classification and regression models in industrial production, especially for time-consuming or sparsely run manufacturing processes. Traditionally, a majority of the limited ground-truth data is used for training, while a handful of samples are left for testing. In that case, the number of test samples is inadequate to properly evaluate the robustness of the ML models under test (i.e., the system under test) for classification and regression. Furthermore, the output of these ML models may be inaccurate or even fail if the input data differ from the expected. This is the case for ML models used in the Electroslag Remelting (ESR) process in the refined steel industry to predict the pressure in a vacuum chamber. A vacuum pumping event that occurs once a workday generates a few hundred samples in a year of pumping for training and testing. In the absence of adequate training and test samples, this paper first presents a method to generate a fresh set of augmented samples based on vacuum pumping principles. Based on the generated augmented samples, three test scenarios and one test oracle are presented to assess the robustness of an ML model used for production on an industrial scale. Experiments are conducted with real industrial production data obtained from Uddeholms AB steel company. The evaluations indicate that Ensemble and Neural Network are the most robust when trained on augmented data using the proposed testing strategy. The evaluation also demonstrates the proposed method's effectiveness in checking and improving ML algorithms' robustness in such situations. The work improves software testing's state-of-the-art robustness testing in similar settings. Finally, the paper presents an MLOps implementation of the proposed approach for real-time ML model prediction and action on the edge


“…Therefore they trained their GAN with 2000 original images. Jain et al (2020) evaluated different GAN based techniques for augmenting an image dataset for training a CNN classifier for the detection of defects on metallic surfaces. They first applied a Geometrical Transformation to generate a set of 9000 images.…”

Section: Review On Training Data Sets From Related Research (mentioning)

confidence: 99%

Synthetic image data augmentation for fibre layup inspection processes: Techniques to enhance the data set

et al. 2021


In the aerospace industry, the Automated Fiber Placement process is an established method for producing composite parts. Nowadays the required visual inspection, subsequent to this process, typically takes up to 50% of the total manufacturing time and the inspection quality strongly depends on the inspector. A Deep Learning based classification of manufacturing defects is a possibility to improve the process efficiency and accuracy. However, these techniques require several hundreds or thousands of training data samples. Acquiring this huge amount of data is difficult and time consuming in a real world manufacturing process. Thus, an approach for augmenting a smaller number of defect images for the training of a neural network classifier is presented. Five traditional methods and eight deep learning approaches are theoretically assessed according to the literature. The selected conditional Deep Convolutional Generative Adversarial Network and Geometrical Transformation techniques are investigated in detail, with regard to the diversity and realism of the synthetic images. Between 22 and 166 laser line scan sensor images per defect class from six common fiber placement inspection cases are utilised for tests. The GAN-Train GAN-Test method was applied for the validation. The studies demonstrated that a conditional Deep Convolutional Generative Adversarial Network combined with a previous Geometrical Transformation is well suited to generate a large realistic data set from less than 50 actual input images. The presented network architecture and the associated training weights can serve as a basis for applying the demonstrated approach to other fibre layup inspection images.
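As a rough illustration of the geometrical transformation step mentioned in the abstract above, the sketch below expands a small image set using only flips and 90-degree rotations. It is not the authors' implementation: the function names (`rotate90`, `flip_horizontal`, `geometric_variants`, `augment_dataset`) are hypothetical, the images are stand-in nested lists rather than real sensor scans, and the GAN stage is not covered.

```python
def rotate90(image):
    """Rotate a 2-D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror a 2-D image left to right."""
    return [row[::-1] for row in image]


def geometric_variants(image):
    """Return the 8 rotation/flip variants of one image.

    For each of the four 90-degree rotations, keep the rotated image
    and its horizontal mirror. The first variant is the original.
    """
    variants = []
    current = [list(row) for row in image]  # copy so the input is untouched
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants


def augment_dataset(images):
    """Apply geometric_variants to every image and flatten the result."""
    return [variant for image in images for variant in geometric_variants(image)]


# Stand-in data: five tiny 2x2 "images" (assumed, for illustration only).
small_set = [[[i, i + 1], [i + 2, i + 3]] for i in range(5)]
augmented = augment_dataset(small_set)
print(len(small_set), "->", len(augmented))  # prints "5 -> 40"
```

Flips and right-angle rotations only rearrange existing pixels, so they add no interpolation artifacts. Transformations such as shearing or zooming would need an image library and are left out of this sketch.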
