Deep convolutional networks have been developed to detect prohibited items in automated X-ray screening systems for transport security. To our knowledge, existing frameworks recognize threats using only baggage security X-ray scans; consequently, their detection accuracy cannot be ensured in other security X-ray domains, such as cargo scans. We propose an object detection method for efficiently detecting contraband items in both cargo and baggage X-ray security scans. The proposed network, MFA-net, consists of three plug-and-play modules: a multiscale dilated convolutional module, a fusion feature pyramid network, and an auxiliary point detection head. First, the multiscale dilated convolutional module converts the standard convolution of the detector backbone into a conditional convolution, aggregating features from multiple dilated convolutions via dynamic feature selection to overcome the object-scale variation issue. Second, the fusion feature pyramid network combines the proposed attention and fusion modules to enhance multiscale object recognition and alleviate the object occlusion problem. Third, the auxiliary point detection head predicts additional keypoints of the bounding box to strengthen localization without requiring further ground-truth information. We evaluated MFA-net on two large-scale X-ray security image datasets from different domains: the Security Inspection X-ray (SIXray) dataset in the baggage domain and our own dataset, named CargoX, in the cargo domain. MFA-net outperformed state-of-the-art object detectors in both domains, indicating that the proposed modules can further increase the detection capability of current object detectors on X-ray security images.
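The core idea of the multiscale dilated convolutional module, aggregating parallel dilated branches with a content-dependent gate, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the neighbour-averaging stand-in for a 3x3 dilated convolution, the dilation rates, and the mean-response gate logits are all illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def dilated_branch(x, d):
    # Stand-in for a 3x3 dilated convolution: average each pixel with its
    # four axis-aligned neighbours at offset `d` (zero-padded borders).
    h, w = x.shape
    p = np.pad(x, d)
    return (p[d:d + h, d:d + w]            # centre
            + p[0:h, d:d + w]              # shifted up by d
            + p[2 * d:2 * d + h, d:d + w]  # shifted down by d
            + p[d:d + h, 0:w]              # shifted left by d
            + p[d:d + h, 2 * d:2 * d + w]  # shifted right by d
            ) / 5.0

def multiscale_dilated_aggregate(x, dilations=(1, 2, 4)):
    # Run parallel dilated branches, then fuse them with a softmax gate
    # (dynamic feature selection). Using each branch's global mean response
    # as the gate logit is an illustrative choice, not the paper's design.
    feats = np.stack([dilated_branch(x, d) for d in dilations])  # (K, H, W)
    gate = softmax(feats.mean(axis=(1, 2)))                      # (K,)
    return np.tensordot(gate, feats, axes=1)                     # (H, W)
```

Because the gate weights depend on the input's own branch responses, the effective receptive field adapts per image, which is what lets one module cover both small baggage items and large cargo objects.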
Several state-of-the-art object detectors have demonstrated outstanding performance by optimizing feature representation through modification of the backbone architecture and exploitation of a feature pyramid. To determine the effectiveness of this approach, we explore modifying object detectors' backbone and feature pyramid using Neural Architecture Search (NAS) and Capsule Networks. We introduce two modules: the NAS-gate convolutional module and the Capsule Attention module. The NAS-gate convolutional module optimizes the standard convolution in a backbone network via differentiable architecture search over multiple convolution conditions to overcome object scale variation. The Capsule Attention module exploits the strong spatial-relationship encoding ability of the capsule network to generate a spatial attention mask that emphasizes important features and suppresses unnecessary ones in the feature pyramid, improving the feature representation and localization capability of the detectors. Experimental results indicate that the NAS-gate convolutional module alleviates the object scale variation problem and that the Capsule Attention module helps avoid inaccurate localization. We then introduce NASGC-CapANet, which incorporates both modules. Comparisons against state-of-the-art object detectors on the MS COCO val-2017 dataset demonstrate that NASGC-CapANet-based Faster R-CNN significantly outperforms the baseline Faster R-CNN with ResNet-50 and ResNet-101 backbones by 2.7% and 2.0% mAP, respectively. Furthermore, NASGC-CapANet-based Cascade R-CNN achieves a box mAP of 43.8% on the MS COCO test-dev dataset.
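The two mechanisms above can be sketched in a few lines of NumPy. The first function shows the differentiable-architecture-search relaxation behind the NAS-gate idea (softmax-weighting candidate convolution conditions by learnable architecture parameters); the second shows how capsule vector lengths can be collapsed into a spatial attention mask. Both are hedged illustrations under assumed shapes and names, not the published module code.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def nas_gate_mixed_op(x, candidate_ops, alphas):
    # DARTS-style relaxation: the gate softmax-weights the outputs of the
    # candidate convolution conditions. In the real module `alphas` are
    # learnable architecture parameters; here they are plain numbers.
    w = softmax(np.asarray(alphas, dtype=float))
    return sum(wi * op(x) for wi, op in zip(w, candidate_ops))

def capsule_attention_mask(capsule_poses):
    # Capsule activations are vector lengths; taking the strongest capsule
    # at each location and squashing to (0, 1) yields a spatial attention
    # mask that can rescale a feature-pyramid level.
    lengths = np.linalg.norm(capsule_poses, axis=-1)    # (H, W, n_caps)
    return 1.0 / (1.0 + np.exp(-lengths.max(axis=-1)))  # (H, W) in (0, 1)
```

As a toy usage, three candidate "convolutions" (identity, doubling, halving) with equal architecture weights blend to their average, illustrating how the search can interpolate between operator choices before committing to one.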
The aim of this study was to develop a novel deep learning (DL) model that detects pancreatic cancer (PC) in computed tomography (CT) images without requiring large annotated training datasets. This retrospective diagnostic study used CT images collected from 2004 to 2019 from 4287 patients diagnosed with PC. We propose a self-supervised learning algorithm, pseudo-lesion segmentation (PS), for PC classification; models were trained with and without PS and validated on randomly divided training and validation sets. We further performed cross-racial external validation using open-access CT images from 361 patients. In internal validation, the accuracy and sensitivity for PC classification were 94.3% (92.8–95.4%) and 92.5% (90.0–94.4%) for the convolutional neural network (CNN)-based DL model, and 95.7% (94.5–96.7%) and 99.3% (98.4–99.7%) for the transformer-based DL model (both with PS). Applying PS to a small training dataset (a randomly sampled 10%) increased accuracy by 20.5% and sensitivity by 37.0%. In external validation, the accuracy and sensitivity were 82.5% (78.3–86.1%) and 81.7% (77.3–85.4%) for the CNN-based model, and 87.8% (84.0–90.8%) and 86.5% (82.3–89.8%) for the transformer-based model (both with PS). PS-based self-supervised learning can increase the performance, reliability, and robustness of DL-based PC classification on unseen, and even small, datasets. The proposed DL model is potentially useful for PC diagnosis.
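A pseudo-lesion pretext task of this kind can be sketched as pasting a synthetic blob into a CT slice and keeping the paste location as a free segmentation label. The NumPy function below is a hypothetical minimal version under assumed parameters (Gaussian blob shape, radius, intensity); the study's actual pseudo-lesion generator may differ.

```python
import numpy as np

def make_pseudo_lesion_pair(ct_slice, rng, radius=6, intensity=0.3):
    # Paste a synthetic Gaussian "pseudo-lesion" at a random location and
    # return (augmented slice, segmentation mask). A network pretrained to
    # segment these blobs learns lesion-like features with no manual labels.
    h, w = ct_slice.shape
    cy = int(rng.integers(radius, h - radius))
    cx = int(rng.integers(radius, w - radius))
    yy, xx = np.mgrid[0:h, 0:w]
    d2 = (yy - cy) ** 2 + (xx - cx) ** 2
    blob = intensity * np.exp(-d2 / (2.0 * (radius / 2.0) ** 2))
    mask = (d2 <= radius ** 2).astype(np.float32)
    return ct_slice + blob, mask
```

Because both the input perturbation and its label come from the generator itself, the pretext dataset is effectively free, which is what makes the approach attractive when only a small annotated cohort is available.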