Surface defect detection aims to classify and locate defects on a target surface, and it is an important part of industrial quality inspection. Most current research on surface defect detection is based on convolutional neural networks (CNNs), which focus on local information and lack global perception, so they cannot extract defect features effectively. In this paper, a defect detection method based on the Swin transformer is proposed. The structure of the Swin transformer is adjusted so that it outputs features at five scales, making it more suitable for defect detection tasks with large variations in target size. A bi-directional feature pyramid network is used as the feature fusion part to efficiently fuse the extracted features. The focal loss is used as the loss function to weight hard- and easy-to-distinguish samples, potentially making the model fit the surface defect data better. To reduce the number of parameters in the model, a shared detection head is used for result prediction. Experiments were conducted on the flange surface defect dataset and the steel surface defect dataset. Compared with the classical CNN-based target detection algorithm, our method improves the mean average precision (mAP) by about 15.4%, while the model size and detection speed are essentially the same as those of the CNN-based method. The experimental results show that the proposed method is more competitive than CNN-based methods and generalizes to some extent across different types of defects.
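The role of the focal loss mentioned above can be illustrated with a minimal sketch of its standard binary form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The abstract does not give the paper's settings, so the function names and the values gamma = 2.0 and alpha = 0.25 below are assumptions chosen for illustration, not the authors' configuration.

```python
import math

EPS = 1e-12  # clipping constant to avoid log(0); an illustrative choice


def cross_entropy(p, y):
    """Plain binary cross-entropy for one prediction.

    p: predicted probability of the positive class, y: label (0 or 1).
    """
    p = min(max(p, EPS), 1.0 - EPS)
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction.

    gamma and alpha are assumed defaults, not values taken from the abstract.
    The factor (1 - p_t) ** gamma shrinks the loss of well-classified (easy)
    samples, so hard samples dominate the total loss.
    """
    p = min(max(p, EPS), 1.0 - EPS)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (p = 0.9) and a hard positive (p = 0.1).
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
```

With gamma = 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a quick way to check an implementation.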
With the advent and rapid development of image tampering technology, tampered images have become harmful to many aspects of our society, and image tampering detection has become increasingly important. Although current forgery detection methods have achieved some success, the scale of the tampered area differs from one forged image to another, and previous methods do not take this into account. In this paper, we argue that the inability of the network to accommodate tampered regions of various sizes is the main reason for the low precision. To address this problem, we propose a neural network architecture called CAU-Net, which adds residual propagation and feedback, attention gates, and Atrous Spatial Pyramid Pooling with CBAM to the U-Net. The Atrous Spatial Pyramid Pooling with CBAM can capture information at multiple scales and adapt to target areas of different sizes. In addition, CAU-Net can solve the vanishing gradient issue and suppress the weight of untampered regions. Because CAU-Net is an end-to-end network without redundant image processing, it detects suspicious images quickly. Finally, we optimize the proposed network structure through an ablation study, and the experimental and visualization results demonstrate that our network performs better on CASIA and NIST16 than state-of-the-art methods.
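The multi-scale idea behind Atrous Spatial Pyramid Pooling can be sketched in one dimension: the same small kernel is applied at several dilation rates in parallel, so each branch sees a different receptive field. This is a simplified illustration only. It omits CBAM and everything else specific to CAU-Net, and the function names and the rates (1, 2, 4) are assumptions, not the paper's settings.

```python
def dilated_conv1d(signal, kernel, rate):
    """1D atrous (dilated) convolution with zero padding and same-length output.

    Kernel taps are spaced `rate` samples apart, which widens the receptive
    field without adding weights. Implemented as cross-correlation, as is
    usual in deep learning libraries.
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * rate
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def aspp_1d(signal, kernel, rates=(1, 2, 4)):
    """Apply the same kernel at several dilation rates (hypothetical rates).

    Returns one output per rate; a real ASPP module would concatenate the
    branches along the channel axis and fuse them with a 1x1 convolution.
    """
    return [dilated_conv1d(signal, kernel, r) for r in rates]


# A single impulse shows how far each branch reaches.
impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
branches = aspp_1d(impulse, [1, 1, 1])
```

In this toy setting, each branch responds to the impulse from positions that lie one, two, or four samples away, which is the 1D analogue of adapting to regions of different sizes.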
Existing end-to-end point cloud registration methods are often inefficient and susceptible to noise. To address these issues, we propose an end-to-end point cloud registration network model, Point Transformer for Registration Network (PTRNet), that considers both local and global features. Our model takes point clouds as inputs and applies a Transformer method to extract their global features. Using a K-Nearest Neighbor (K-NN) topology, our method then encodes the local features of a point cloud and integrates them with the global features to obtain the point cloud's strong global features. Comparative experiments on the ModelNet40 data set show that our method yields lower mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE) between the ground truth and predicted values than competing methods. In the multi-object-class case without noise, the mean absolute rotation error of PTRNet is reduced to 1.601 degrees and the mean absolute translation error is reduced to 0.005 units. Compared with other recent end-to-end registration methods and traditional point cloud registration methods, PTRNet has lower error, higher registration accuracy, and better robustness.
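The three error metrics named above follow their standard definitions, sketched below for two equal-length sequences of values. How the paper parameterizes rotation and translation before computing them is not stated in the abstract, so the sample values here are made up purely for illustration.

```python
import math


def mse(truth, pred):
    """Mean square error between ground-truth and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth)


def rmse(truth, pred):
    """Root mean square error: square root of the MSE, in the original units."""
    return math.sqrt(mse(truth, pred))


def mae(truth, pred):
    """Mean absolute error between ground-truth and predicted values."""
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)


# Hypothetical ground-truth and predicted values (not data from the paper).
truth = [0.0, 0.0, 0.0, 0.0]
pred = [1.0, -1.0, 2.0, -2.0]
errors = (mse(truth, pred), rmse(truth, pred), mae(truth, pred))
```

MSE and RMSE penalize large individual errors more heavily than MAE, which is why registration results are often reported with all three.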