Various techniques using artificial intelligence (AI) have made significant contributions to the field of medical image- and video-based diagnosis, such as radiology, pathology, and endoscopy, including the classification of gastrointestinal (GI) diseases. Most previous studies on the classification of GI diseases use only spatial features, which results in low performance when classifying multiple GI diseases. Although a few previous studies have used temporal features based on a three-dimensional convolutional neural network, they covered only a specific part of the GI tract with a limited number of classes. To overcome these problems, we propose a comprehensive AI-based framework for the classification of multiple GI diseases from endoscopic videos, which can simultaneously extract both spatial and temporal features to achieve better classification performance. Two different residual networks and a long short-term memory model are integrated in a cascaded mode to extract spatial and temporal features, respectively. Experiments were conducted on a combined dataset consisting of one of the largest collections of endoscopic videos, with 52,471 frames. The results demonstrate the effectiveness of the proposed classification framework for multiple GI diseases. The experimental results of the proposed model (97.057% area under the curve) demonstrate superior performance over state-of-the-art methods and indicate its potential for clinical applications.
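The following is a minimal PyTorch sketch of the cascaded design described above: a residual network extracts per-frame spatial features and an LSTM aggregates them over time before classification. Class and variable names are illustrative assumptions, not the paper's implementation, and a single ResNet-18 backbone stands in for the two residual networks used in the study.

```python
import torch
import torch.nn as nn
from torchvision import models

class SpatioTemporalClassifier(nn.Module):
    """Cascaded ResNet + LSTM: per-frame spatial features, temporal aggregation."""
    def __init__(self, num_classes, hidden_size=256):
        super().__init__()
        backbone = models.resnet18(weights=None)   # spatial feature extractor
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()                # keep pooled 512-d features
        self.backbone = backbone
        self.lstm = nn.LSTM(feat_dim, hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, clips):                      # clips: (B, T, 3, H, W)
        b, t, c, h, w = clips.shape
        feats = self.backbone(clips.view(b * t, c, h, w))    # (B*T, feat_dim)
        feats = feats.view(b, t, -1)                          # (B, T, feat_dim)
        _, (h_n, _) = self.lstm(feats)                        # final hidden state
        return self.classifier(h_n[-1])                       # (B, num_classes)

# Example usage with a random 16-frame clip batch:
# logits = SpatioTemporalClassifier(num_classes=8)(torch.randn(2, 16, 3, 224, 224))
```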
Medical-image-based diagnosis is a tedious task, and small lesions in various medical images can be overlooked by medical experts because of the limited attention span of the human visual system, which can adversely affect medical treatment. However, this problem can be resolved by exploring similar cases in previous medical databases through an efficient content-based medical image retrieval (CBMIR) system. In the past few years, heterogeneous medical imaging databases have been growing rapidly with the advent of different types of medical imaging modalities. A medical doctor now typically refers to several imaging modalities together, such as computed tomography (CT), magnetic resonance imaging (MRI), X-ray, and ultrasound of various organs, for the diagnosis and treatment of a specific disease. Accurate classification and retrieval of multimodal medical imaging data is the key challenge for a CBMIR system. Most previous attempts use handcrafted features for medical image classification and retrieval, which show low performance on massive collections of multimodal databases. Although there are a few previous studies on the use of deep features for classification, the number of classes is very small. To solve this problem, we propose a classification-based retrieval system for multimodal medical images from various types of imaging modalities using an artificial intelligence technique, named an enhanced residual network (ResNet). Experimental results with 12 databases comprising 50 classes demonstrate that the accuracy and F1-score of our method are 81.51% and 82.42%, respectively, which are higher than those of the previous CBMIR method (accuracy of 69.71% and F1-score of 69.63%).
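As a rough illustration of classification-based retrieval, the sketch below first restricts the candidate pool to database images whose label matches the class predicted for the query, then ranks candidates by cosine similarity between deep features. The function name, the feature/label arrays, and the fallback behavior are assumptions for illustration; the paper's enhanced ResNet would supply the features and class probabilities.

```python
import numpy as np

def classify_then_retrieve(query_feat, query_probs, db_feats, db_labels, top_k=5):
    """Classification-guided retrieval: search only images of the predicted class,
    then rank the candidates by cosine similarity of their deep features."""
    pred_class = int(np.argmax(query_probs))
    candidates = np.where(db_labels == pred_class)[0]
    if candidates.size == 0:                        # fall back to the whole database
        candidates = np.arange(len(db_labels))
    q = query_feat / (np.linalg.norm(query_feat) + 1e-12)
    d = db_feats[candidates]
    d = d / (np.linalg.norm(d, axis=1, keepdims=True) + 1e-12)
    sims = d @ q
    return candidates[np.argsort(-sims)[:top_k]]    # indices of retrieved images
```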
Automatic chest anatomy segmentation plays a key role in computer-aided disease diagnosis, such as for cardiomegaly, pleural effusion, emphysema, and pneumothorax. Among these diseases, cardiomegaly is considered a perilous disease, involving a high risk of sudden cardiac death. It can be diagnosed early by an expert medical practitioner using chest X-ray (CXR) analysis. The cardiothoracic ratio (CTR) and transverse cardiac diameter (TCD) are the clinical criteria used to estimate heart size for diagnosing cardiomegaly. Manual estimation of the CTR and related measurements is a time-consuming process and requires significant work by the medical expert. Cardiomegaly and related diseases can be automatically assessed through accurate anatomical semantic segmentation of CXRs using artificial intelligence. Automatic segmentation of the lungs and heart from CXRs is considered an intensive task owing to inferior-quality images and intensity variations caused by nonideal imaging conditions. Although there are a few deep learning-based techniques for chest anatomy segmentation, most of them consider only single-class lung segmentation with deep, complex architectures that require many trainable parameters. To address these issues, this study presents two multiclass residual mesh-based CXR segmentation networks, X-RayNet-1 and X-RayNet-2, which are specifically designed to provide fine segmentation performance with few trainable parameters compared to conventional deep learning schemes. The proposed methods utilize semantic segmentation to support the diagnostic procedure for related diseases. To evaluate X-RayNet-1 and X-RayNet-2, experiments were performed with the publicly available Japanese Society of Radiological Technology (JSRT) dataset for multiclass segmentation of the lungs, heart, and clavicle bones; two other publicly available datasets, the Montgomery County (MC) and Shenzhen X-ray (SC) sets, were used for lung segmentation. The experimental results showed that X-RayNet-1 achieved fine performance on all datasets and X-RayNet-2 achieved competitive performance with a 75% parameter reduction.
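To connect the segmentation output to the clinical criterion mentioned above, the sketch below estimates the CTR from predicted binary heart and lung masks by taking the widest horizontal extent of each structure. This is a simplified approximation written for illustration (the thoracic diameter is approximated by the combined lung extent), not the measurement procedure used in the study.

```python
import numpy as np

def cardiothoracic_ratio(heart_mask, lung_mask):
    """Approximate CTR from binary masks of shape (H, W) produced by a
    segmentation network. TCD = widest horizontal extent of the heart,
    MTD = widest horizontal extent of the thorax (approximated by the lungs);
    a CTR above ~0.5 is commonly read as a sign of cardiomegaly."""
    def widest_extent(mask):
        cols = np.where(mask.any(axis=0))[0]   # columns containing the structure
        return 0 if cols.size == 0 else cols[-1] - cols[0] + 1

    tcd = widest_extent(heart_mask)
    mtd = widest_extent(lung_mask)
    return tcd / mtd if mtd > 0 else float("nan")
```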
Finger-vein biometrics is a recognition method based on the shape of the veins in the fingers, and it has the advantage of being difficult to forge. However, shading is inevitably produced by the bones and fingernails, and changes in illumination occur when acquiring finger-vein images. Previous studies have performed finger-vein recognition using a single type of image: either a texture image or a finger-vein segmented image (shape image). A texture image provides numerous features, but it is vulnerable to changes in illumination during recognition and contains noise in regions other than the finger-vein region. A shape image is less affected by noise; however, recognition accuracy is significantly reduced because fewer features are available and regions can be mis-segmented due to shading. In this study, therefore, rough finger-vein regions in an image are detected to reduce the effect of mis-segmented regions and to complement the drawbacks of shape-image-based finger-vein recognition. Furthermore, score-level fusion is performed on the two output scores of a deep convolutional neural network (CNN) extracted from the texture and shape images, which reduces the sensitivity to noise while efficiently using the diverse features provided by the texture image. Two open databases, the Shandong University homologous multi-modal traits finger-vein database and the Hong Kong Polytechnic University finger image database, are used for the experiments, and the proposed method shows better recognition performance than the state-of-the-art methods.
INDEX TERMS: Finger-vein recognition, shape and texture images of finger-vein, deep CNN, score-level fusion.
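A minimal sketch of score-level fusion is shown below: the two CNN matching scores (from the texture and shape images) are normalized to a common range and combined by a weighted sum. The normalization scheme and the fixed weight are assumptions for illustration; the fusion rule and weight actually used would be chosen on a validation set.

```python
import numpy as np

def min_max_normalize(scores):
    """Min-max normalization of a set of raw matching scores to [0, 1]."""
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo + 1e-12)

def weighted_sum_fusion(texture_score, shape_score, w=0.5):
    """Score-level fusion of two normalized matching scores by a weighted sum;
    w balances the texture-image score against the shape-image score."""
    return w * texture_score + (1.0 - w) * shape_score
```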
Age estimation using facial images is applicable in various fields, such as age-targeted marketing, analysis of demand and preference for goods, skin care, remote medical services, and age statistics for a specific place. However, if a low-resolution camera is used to capture the images, or if the facial images are obtained from subjects standing far away, the resolution of the images is degraded. In such cases, information about wrinkles and the texture of the face is lost, and features that are crucial for age estimation cannot be obtained. Existing studies on age estimation have not considered the degradation of resolution and have used only high-resolution facial images. To overcome this limitation, this paper proposes a deep convolutional neural network (CNN)-based age estimation method that reconstructs low-resolution facial images as high-resolution images using a conditional generative adversarial network (GAN) and then uses these images as inputs. Experiments are conducted using two open databases (the PAL and MORPH databases). The results demonstrate that the proposed method achieves higher accuracy in high-resolution reconstruction and age estimation than state-of-the-art methods.
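The two-stage structure can be expressed as a small PyTorch wrapper, sketched below under the assumption that a trained GAN generator and a trained age-estimation CNN are available; both sub-networks are placeholders, and the class name is illustrative rather than the paper's implementation.

```python
import torch.nn as nn

class AgeEstimationPipeline(nn.Module):
    """Two-stage pipeline: a super-resolution generator reconstructs a
    high-resolution face image, which is then fed to a CNN age estimator."""
    def __init__(self, generator: nn.Module, age_cnn: nn.Module):
        super().__init__()
        self.generator = generator   # e.g., a conditional-GAN generator
        self.age_cnn = age_cnn       # e.g., a CNN with an age-regression head

    def forward(self, low_res_face):                 # (B, 3, h, w)
        high_res_face = self.generator(low_res_face)  # reconstructed face
        return self.age_cnn(high_res_face)            # predicted age
```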