Achieving high-precision detection and real-time deployment of an intelligent lithology identification system has significant engineering implications in the geotechnical, geological, water-conservancy, and mining disciplines. In this study, a lightweight intelligent lithology identification model is proposed to address this problem. The MobileNetV2 model is used as the backbone network to reduce the number of network parameters, and channel attention and spatial attention mechanisms are incorporated into the model to improve the network's extraction of complex and abstract petrographic features. The model is evaluated against ResNet101, InceptionV3, and MobileNetV2 through network training, computational performance tests, test-set results, and Grad-CAM interpretability analysis. The training accuracy of the proposed model is 98.59%, the training duration is 76 min, and the trained model is only 6.38 MB in size. The precision (P), recall (R), and harmonic mean (F1-score) are 89.62%, 91.38%, and 90.42%, respectively. Compared with the three competing models, the proposed model strikes a better balance between lithology recognition accuracy and speed, and its attention covers rock feature areas that are wider and more uniform. The model exhibits strong anti-interference capability and improved robustness and generalization, and it can be deployed in real time on client or edge devices, giving it practical promotion value.
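The abstract does not specify the exact attention design, so the following is only a minimal NumPy sketch of the generic channel-then-spatial attention pattern (CBAM-style) that the description suggests: channels are reweighted via global average pooling through a small bottleneck, then spatial locations are gated by pooled channel statistics. All weights and the simplified fusion step are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Squeeze each channel with global average pooling, then reweight
    channels through a two-layer bottleneck (CBAM-style sketch)."""
    # feat: (C, H, W); w1: (C//r, C); w2: (C, C//r)
    pooled = feat.mean(axis=(1, 2))                        # (C,)
    weights = sigmoid(w2 @ np.maximum(w1 @ pooled, 0.0))   # (C,)
    return feat * weights[:, None, None]

def spatial_attention(feat):
    """Collapse channels with mean and max pooling, then gate each
    spatial location; a real block would convolve the two maps first."""
    avg_map = feat.mean(axis=0)                # (H, W)
    max_map = feat.max(axis=0)                 # (H, W)
    gate = sigmoid(avg_map + max_map)          # simplified fusion (assumption)
    return feat * gate[None, :, :]

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2                        # r = bottleneck reduction ratio
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
out = spatial_attention(channel_attention(feat, w1, w2))
print(out.shape)  # (8, 4, 4)
```

Because both attention steps only rescale the feature map, the output keeps the input's shape and the blocks can be dropped between existing backbone stages without changing downstream layer sizes.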
Underwater images are the most direct and effective way to obtain underwater information. However, underwater images typically suffer from reduced contrast and colour distortion caused by the absorption and scattering of light by water, which severely limits the further development of underwater vision tasks. Recently, convolutional neural networks have been widely applied to underwater image enhancement for their powerful local feature extraction, but because convolution is a local operation, they cannot capture global context well. Conversely, the recently emerging Transformer captures global context but does not model local correlations. To combine the strengths of both, Cformer is proposed, a Unet-like hybrid network structure. First, a depth self-calibrated block is proposed to extract local image features effectively. Second, a novel cross-shaped enhanced window Transformer block is proposed; it captures long-range pixel interactions while dramatically reducing the computational complexity over the feature maps. Finally, the depth self-calibrated block and the cross-shaped enhanced window Transformer block are fused to build a global-local Transformer module. Extensive ablation studies on public underwater datasets demonstrate the effectiveness of the individual components, and qualitative and quantitative comparisons indicate that Cformer achieves superior performance compared with other competitive models.
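The abstract does not detail the cross-shaped enhanced window attention, but the complexity-reduction idea it names can be illustrated with the standard non-overlapping window partition used by window-based Transformers: attention is computed inside each window rather than over all pixel pairs. The function names and the 8x8 example below are illustrative assumptions, not Cformer's actual layout.

```python
import numpy as np

def window_partition(feat, win):
    """Split an (H, W, C) feature map into non-overlapping win x win
    windows so self-attention runs inside each window independently."""
    H, W, C = feat.shape
    assert H % win == 0 and W % win == 0
    x = feat.reshape(H // win, win, W // win, win, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win * win, C)

def attention_pair_count(H, W, win=None):
    """Number of token pairs scored by self-attention:
    global attention vs. attention restricted to windows."""
    if win is None:
        return (H * W) ** 2                    # global: quadratic in H*W
    n_windows = (H // win) * (W // win)
    return n_windows * (win * win) ** 2        # quadratic only within windows

feat = np.zeros((8, 8, 3))
windows = window_partition(feat, 4)
print(windows.shape)                # (4, 16, 3): 4 windows of 16 tokens
print(attention_pair_count(8, 8))       # 4096 pairs for global attention
print(attention_pair_count(8, 8, 4))    # 1024 pairs for windowed attention
```

Even at this toy size the windowed variant scores 4x fewer pairs; the gap grows linearly with the number of windows, which is why window-based designs scale to full-resolution enhancement.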