Small targets have low resolution and carry little feature information, which makes them difficult to recognize and locate and greatly hinders improvements in object detection accuracy. In this paper, an object detection model (TDFP) based on a CNN and a transformer is established; it combines local and global context to establish connections between features. In the proposed transformed dynamic feature pyramid network, a transformer module dynamically transforms and fuses the multi-scale features generated by the backbone, producing a transformed feature pyramid with richer multi-scale features and context information. During this transformation, a gate block dynamically selects single-scale or cross-scale transformation to achieve the best form of transformation and fusion of the multi-scale features. The experimental results show that the model improves small-target detection accuracy. With a ResNeXt-101 backbone, TDFP achieves 46.2% AP and 26.3% APS on MS COCO, and it uses the amount of computation as a loss constraint to achieve a better balance between detection accuracy and computational complexity.
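The gate described in this abstract can be pictured with a small sketch. The abstract does not give the gate's actual form, so everything below is an assumption made for illustration: the soft sigmoid gate, the 1-D feature lists, and the two stand-in transforms (`single_scale`, `cross_scale`) are hypothetical and are not the TDFP implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def single_scale(x):
    # Hypothetical within-level transform: 3-tap average with edge clamping.
    n = len(x)
    return [(x[max(i - 1, 0)] + x[i] + x[min(i + 1, n - 1)]) / 3.0 for i in range(n)]


def cross_scale(x, coarse):
    # Hypothetical cross-level transform: nearest-neighbour upsampling of a
    # coarser pyramid level, added to the current level.
    scale = len(x) // len(coarse)
    return [x[i] + coarse[i // scale] for i in range(len(x))]


def gated_fusion(x, coarse, weight, bias):
    # Assumed gate: a scalar in (0, 1) computed from the mean activation.
    # A gate near 1 favours the single-scale path, near 0 the cross-scale path.
    gate = sigmoid(weight * (sum(x) / len(x)) + bias)
    s = single_scale(x)
    c = cross_scale(x, coarse)
    fused = [gate * a + (1.0 - gate) * b for a, b in zip(s, c)]
    return gate, fused


if __name__ == "__main__":
    level = [1.0, 2.0, 3.0, 4.0]   # fine pyramid level (toy values)
    coarser = [10.0, 20.0]         # coarser pyramid level (toy values)
    gate, fused = gated_fusion(level, coarser, weight=0.0, bias=0.0)
    print(gate, fused)
```

A soft gate is used here only because it is easy to read. A hard (0 or 1) selection would skip the unused path and save computation, which seems closer in spirit to the computation constraint the abstract mentions.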
Coal and gas outburst is an urgent and persistent problem in coal resource extraction that severely threatens safe and sustainable coal mine production. Its mechanism, and the role of gas in coal breaking, remain unclear. To explore this problem, the gas desorption-diffusion behavior of bituminous coal of different particle sizes and its influence on coal breaking during outbursts were investigated through mercury intrusion porosimetry (MIP) tests, isothermal adsorption tests, and desorption-diffusion tests. The results indicated that the cumulative diffusion amount (Qt), the diffusion rate (Qt/Q∞), the effective diffusion coefficient (D′), and the kinetic diffusion parameter (υ) all decreased as particle size increased. This means that gas desorbs and diffuses more easily from smaller coal blocks, which in turn breaks the coal into more fine particles and further accelerates gas desorption. As a result, once an outburst starts, a positive feedback forms within a short time: coal breaks continuously and gas is released rapidly and abundantly, so that large quantities of gas are released and the outburst is promoted. The findings of this study enhance our understanding of how gas participates in coal fragmentation during outbursts, which is conducive to the prevention and control of coal mine disasters and to the sustainable production of coal resources.
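The particle-size trend reported in this abstract can be illustrated with the classical unipore solution for diffusion out of a homogeneous sphere, Qt/Q∞ = 1 − (6/π²) Σ (1/n²) exp(−n²π²Dt/r²). The abstract does not say which diffusion model the authors fitted, so this choice is an assumption. The diffusivity, radii, and time below are hypothetical values chosen only to show the shape of the dependence.

```python
import math


def fractional_release(diffusivity, radius, time, terms=200):
    """Fraction Qt/Q_inf released from a sphere (classical unipore series).

    diffusivity in m^2/s, radius in m, time in s. The series is truncated
    after `terms` terms, so values at very short times are slightly off.
    """
    total = sum(
        math.exp(-(n * n) * math.pi ** 2 * diffusivity * time / (radius * radius)) / (n * n)
        for n in range(1, terms + 1)
    )
    return 1.0 - 6.0 / math.pi ** 2 * total


if __name__ == "__main__":
    D = 1e-10   # hypothetical diffusivity, m^2/s
    t = 60.0    # elapsed time, s
    for r in (1e-4, 1e-3):   # hypothetical particle radii, m
        print(f"r = {r:.0e} m -> Qt/Q_inf = {fractional_release(D, r, t):.3f}")
```

With the diffusivity held fixed, the smaller sphere releases a larger fraction of its gas in the same time, purely because of geometry. The abstract reports that D′ itself also falls with particle size, which would strengthen this effect.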
Occlusion is one of the fundamental problems of computer vision, especially for non-rigid objects with variable shapes and complex backgrounds, such as humans. With the rise of computer vision in recent years, occlusion has become increasingly prominent in branches such as human pose estimation, where the object of study is a human being. In this paper, we propose a two-stage framework that addresses the human de-occlusion problem. The first stage is the amodal completion stage, in which a new network structure is designed based on the hourglass network and a large amount of prior information is obtained from the training set to constrain the model to predict in the correct direction. The second stage is the content recovery stage, in which visible guided attention (VGA) is added to a U-Net, with its symmetric U-shaped structure, to derive relationships between visible and invisible regions and to capture contextual information across scales. Taken as a whole, the first stage acts as an encoding stage and the second as a decoding stage, and the network in each stage itself consists of an encoder and a decoder, so the framework is symmetrical both overall and locally. To evaluate the proposed approach, we provide a dataset, the human occlusion dataset, which contains occluded objects from drilling scenes and synthetic images that are close to reality. Experiments show that the method performs well in terms of quality and diversity compared with existing methods. It is able to remove occlusions in complex scenes and can be extended to human pose estimation.
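One way to picture attention guided by visible regions is masked attention in which occluded positions draw their content only from visible positions. The abstract does not specify how VGA is built, so the sketch below is an assumption for illustration, not the authors' module. The function name and the toy feature vectors are hypothetical.

```python
import math


def visible_guided_attention(features, visible):
    """Fill occluded positions from visible ones with dot-product attention.

    features: list of equal-length vectors, one per spatial position.
    visible:  list of bools, True where the position is not occluded.
    """
    dim = len(features[0])
    keys = [k for k, v in zip(features, visible) if v]
    output = []
    for query, is_visible in zip(features, visible):
        if is_visible or not keys:
            # Visible positions (or the degenerate all-occluded case) pass through.
            output.append(list(query))
            continue
        # Scaled dot-product scores against visible positions only.
        scores = [sum(a * b for a, b in zip(query, k)) / math.sqrt(dim) for k in keys]
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]  # numerically stable softmax
        norm = sum(weights)
        output.append(
            [sum(w * k[d] for w, k in zip(weights, keys)) / norm for d in range(dim)]
        )
    return output


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    mask = [True, True, False]   # the last position is occluded
    print(visible_guided_attention(feats, mask))
```

In a real network the queries, keys, and values would come from learned projections of feature maps at several scales. The sketch keeps only the masking idea.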