Retinal diseases such as age-related macular degeneration and diabetic macular edema can lead to irreversible blindness. With optical coherence tomography (OCT), clinicians can view cross-sections of the retinal layers and provide patients with a diagnosis. However, manual reading of OCT images is time-consuming, labor-intensive, and error-prone. Computer-aided diagnosis algorithms improve efficiency by automatically analyzing and diagnosing retinal OCT images, yet their accuracy and interpretability can be further improved through effective feature extraction, loss optimization, and visualization analysis. In this paper, we propose an interpretable Swin-Poly Transformer network for automatic retinal OCT image classification. By shifting the window partition, the Swin-Poly Transformer builds connections between neighboring non-overlapping windows of the previous layer and thus gains the flexibility to model multi-scale features. Moreover, the Swin-Poly Transformer adjusts the importance of the polynomial bases to refine the cross-entropy loss for better retinal OCT image classification. In addition, the proposed method provides confidence score maps, helping medical practitioners understand the model's decision-making process. Experiments on OCT2017 and OCT-C8 show that the proposed method outperforms both convolutional neural network approaches and ViT, achieving an accuracy of 99.80% and an AUC of 99.99%.
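The polynomial refinement of cross entropy described in this abstract is consistent with a Poly-1 style loss, in which the leading polynomial basis of the cross-entropy expansion is re-weighted. The sketch below is a minimal, hypothetical PyTorch implementation under that assumption; the function name `poly1_cross_entropy` and the coefficient `epsilon` are illustrative, not the authors' exact code.

```python
import torch
import torch.nn.functional as F

def poly1_cross_entropy(logits, targets, epsilon=1.0):
    """Poly-1 style refinement of cross entropy: CE + epsilon * (1 - p_t).

    logits:  (batch, num_classes) raw class scores
    targets: (batch,) integer class labels
    epsilon: weight of the leading polynomial term (illustrative value)
    """
    ce = F.cross_entropy(logits, targets, reduction="none")  # standard per-sample CE
    # probability assigned to the true class of each sample
    pt = torch.softmax(logits, dim=-1).gather(1, targets.unsqueeze(1)).squeeze(1)
    # re-weight the first polynomial basis (1 - p_t) and average over the batch
    return (ce + epsilon * (1.0 - pt)).mean()

# toy usage: 4 OCT images, 4 classes (e.g. CNV, DME, drusen, normal)
logits = torch.randn(4, 4)
labels = torch.tensor([0, 1, 2, 3])
loss = poly1_cross_entropy(logits, labels)
```

Setting `epsilon` to zero recovers plain cross entropy, so the extra term can be tuned without changing the rest of the training pipeline.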
Early detection of tumors is of great significance for determining treatment plans. However, tumor detection remains challenging due to interference from diseased tissue, the diversity of mass scales, and the ambiguity of tumor boundaries. Because the features of small tumors and of tumor boundaries are difficult to extract, the semantic information of high-level feature maps is needed to enrich the regional features and local attention features of tumors. To address the problems of small tumor objects and the lack of contextual features, this paper proposes a novel Semantic Pyramid Network with Transformer Self-attention, named SPN-TS, for tumor detection. Specifically, we first design a new Feature Pyramid Network for the feature extraction stage that changes the traditional cross-layer connection scheme and focuses on enriching the features of small tumor regions. We then introduce a transformer attention mechanism into the framework to learn the local features of tumor boundaries. Extensive experiments were performed on the publicly available CBIS-DDSM dataset, a Curated Breast Imaging Subset of the Digital Database for Screening Mammography. The proposed method outperforms the compared models, achieving 93.26% sensitivity, 95.26% specificity, 96.78% accuracy, and a Matthews Correlation Coefficient (MCC) of 87.27%. By effectively addressing small objects and ambiguous boundaries, the method achieves the best detection performance, and it can be extended to the detection of other diseases and serve as an algorithmic reference for general object detection.
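As a concrete illustration of combining a feature-pyramid fusion step with transformer self-attention, the following sketch applies multi-head self-attention to a coarse feature map before top-down fusion with a finer map. This is a hypothetical outline under stated assumptions, not the SPN-TS architecture itself: the module name `AttentiveFPNLevel`, the channel counts, the head count, and the placement of attention are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveFPNLevel(nn.Module):
    """One top-down FPN fusion step with self-attention on the coarser level.

    Assumes the coarser map `upper` has already been projected to `out_ch`
    channels, as in a standard FPN top-down pathway.
    """
    def __init__(self, in_ch, out_ch=256, heads=4):
        super().__init__()
        self.lateral = nn.Conv2d(in_ch, out_ch, kernel_size=1)          # lateral 1x1 conv
        self.attn = nn.MultiheadAttention(out_ch, heads, batch_first=True)
        self.smooth = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, lower, upper):
        # self-attention over the spatial positions of the coarser map
        b, c, h, w = upper.shape
        tokens = upper.flatten(2).transpose(1, 2)                       # (b, h*w, c)
        tokens, _ = self.attn(tokens, tokens, tokens)
        upper = tokens.transpose(1, 2).reshape(b, c, h, w)
        # top-down fusion: upsample the attended coarse map, add the lateral fine map
        top_down = F.interpolate(upper, size=lower.shape[-2:], mode="nearest")
        fused = self.lateral(lower) + top_down
        return self.smooth(fused)

# toy usage: a 512-channel 32x32 backbone map fused with a 256-channel 16x16 pyramid map
c4 = torch.randn(1, 512, 32, 32)
p5 = torch.randn(1, 256, 16, 16)
p4 = AttentiveFPNLevel(in_ch=512)(c4, p5)   # -> (1, 256, 32, 32)
```

Applying attention before upsampling keeps the token sequence short (the coarse map has few spatial positions), which is a common way to keep self-attention over feature maps tractable.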