Cardiac signals are frequently used in disease and emotion analyses. However, current measurement methods mostly require direct contact. Remote photoplethysmography (rPPG), proposed in recent years, uses a consumer-grade camera to measure the minute color variations on the face caused by blood volume changes as the heart pumps. In this study, we propose a deep learning framework based on a lightweight, task-adapted version of U-Net to extract rPPG. The face video was converted into a multiscale spatio-temporal map (MSTmap) as input to the network. Two types of attention mechanisms were added: variations of the squeeze-and-excitation block (SE block), which compress global information to enhance the channel and ROI signals, and a multihead attention block with position encoding, which extracts information from different parts of the signal. We further propose using virtual PPG (vPPG) as a replacement for the PPG ground truth so that the model focuses on learning peak information instead of morphological details. Extensive experiments were conducted using the UBFC-rPPG dataset for heart rate (HR) and heart rate variability (HRV) estimation. The model achieved a root-mean-square error of 0.78 bpm and a correlation coefficient of 0.99 in heart rate estimation, which is comparable to the state of the art while being more lightweight.
INDEX TERMS Attention, remote photoplethysmography, remote heart rate estimation, spatio-temporal map.
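The two heart rate metrics quoted in the rPPG abstract above (root-mean-square error in bpm and the correlation coefficient) can be computed as in the following minimal sketch. It assumes the correlation coefficient is Pearson's r, which the abstract does not state explicitly. The function names and the sample values are illustrative only and are not taken from the paper.

```python
import math


def rmse(hr_true, hr_pred):
    """Root-mean-square error between reference and estimated heart rates (bpm)."""
    n = len(hr_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(hr_true, hr_pred)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Made-up per-video heart rates in bpm, for illustration only.
hr_reference = [60.0, 72.0, 85.0, 90.0]
hr_estimated = [61.0, 71.0, 85.0, 92.0]

print("RMSE (bpm):", rmse(hr_reference, hr_estimated))
print("Pearson r:", pearson_r(hr_reference, hr_estimated))
```

A low RMSE together with r close to 1 indicates that the estimates are both close to the reference values and track their variation across recordings.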
Gastric cancer can be classified into different subtypes according to its genetic expression. Microsatellite instability (MSI) is one of these subtypes and an important clinical marker for prognosis and for considering immunotherapy. Since genetic testing is relatively expensive and laborious, this study tackles the challenge of using deep neural networks (DNNs) to identify MSI by analyzing histomorphologic features of gastric whole-slide images (WSIs). A two-stage patch-wise framework was proposed, which first differentiates the tumor regions from normal tissue, then predicts MSI status from the tumorous patches. The proposed deep learning architecture enhances the residual attention network with non-local modules and visual context fusion modules, thereby allowing both local fine-grained details and coarse long-range dependencies to be captured. Image post-processing procedures were also proposed to better align the region segmentation with pathologist annotations. The model was applied to a three-way classification task, namely normal tissue, microsatellite stable (MSS), and MSI, using a private dataset gathered by Chang Gung Memorial Hospital, and achieved 91.95% slide-wise accuracy. We also studied the feasibility of transfer learning by fine-tuning on the TCGA-STAD public dataset, where we attained an accuracy of 96.53% and an AUC of 0.99, outperforming previously reported results.
INDEX TERMS Image post-processing, microsatellite instability, non-local neural networks, residual attention network, whole slide image.
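The two-stage, patch-wise idea in the abstract above (tumor versus normal first, then MSI status from the tumorous patches) can be illustrated with a small sketch of how patch outputs might be combined into a slide-level label. The abstract does not describe the aggregation rule, so the thresholds, the mean-probability vote, and the function name `slide_label` are all assumptions made for illustration, not the authors' method.

```python
def slide_label(patches, tumor_threshold=0.5, msi_threshold=0.5):
    """Combine per-patch outputs into one slide-level label.

    patches: list of (p_tumor, p_msi) pairs, one per patch, where p_tumor is a
    hypothetical stage-one tumor probability and p_msi a hypothetical stage-two
    MSI probability. Both thresholds are illustrative assumptions.
    """
    # Stage 1: keep only the patches classified as tumor.
    msi_probs = [p_msi for p_tumor, p_msi in patches if p_tumor >= tumor_threshold]
    if not msi_probs:
        return "normal"
    # Stage 2: average the MSI probability over the tumorous patches only.
    mean_msi = sum(msi_probs) / len(msi_probs)
    return "MSI" if mean_msi >= msi_threshold else "MSS"


# Made-up patch outputs for a single slide.
example_patches = [(0.9, 0.8), (0.8, 0.7), (0.1, 0.0)]
print(slide_label(example_patches))
```

Restricting the second stage to tumorous patches keeps normal tissue from diluting the MSI vote, which is the motivation the abstract gives for separating the two stages.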
We propose an emotion recognition framework based on ResNet, bidirectional long short-term memory (BiLSTM) modules, and data augmentation using a ResNet deep convolutional generative adversarial network (DCGAN), with photoplethysmography (PPG) signals as input. The emotions identified in this study were classified into two classes (positive and negative) and four classes (neutral, angry, happy, and sad). The framework achieved recognition rates of 90.34% and 86.32% in the two- and four-class emotion recognition tasks, respectively, outperforming other representative methods. Moreover, we show that the ResNet DCGAN module can synthesize samples that not only resemble those in the training set but also capture discriminative features of the different classes. The distinguishability of the classes was enhanced when these synthetic samples were added to the original samples, which in turn improved the test accuracy of the model when trained on the mixed samples. This effect was evaluated using various quantitative and qualitative methods, including the inception score (IS), Fréchet inception distance (FID), GAN quality index (GQI), linear discriminant analysis (LDA), and Mahalanobis distance (MD).
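One of the measures named in the abstract above, the Mahalanobis distance, can be used to quantify how far apart two classes sit in feature space. The sketch below computes the distance between two class means under a pooled covariance estimate. The abstract does not say how MD was computed, so the pooled-covariance choice, the function name, and the toy feature vectors are assumptions for illustration.

```python
import numpy as np


def mahalanobis_between_classes(features_a, features_b):
    """Mahalanobis distance between two class means using a pooled covariance.

    features_a, features_b: arrays of shape (n_samples, n_features) with
    n_features >= 2. The pooled-covariance estimate is an assumption here.
    """
    a = np.asarray(features_a, dtype=float)
    b = np.asarray(features_b, dtype=float)
    diff = a.mean(axis=0) - b.mean(axis=0)
    n_a, n_b = len(a), len(b)
    # Pool the two unbiased sample covariances, weighted by degrees of freedom.
    pooled = ((n_a - 1) * np.cov(a, rowvar=False)
              + (n_b - 1) * np.cov(b, rowvar=False)) / (n_a + n_b - 2)
    # Solve pooled @ x = diff rather than forming the inverse explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(pooled, diff)))


# Toy 2-D feature vectors for two classes (made up for illustration).
class_a = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
class_b = class_a + np.array([4.0, 0.0])

print(mahalanobis_between_classes(class_a, class_b))
```

Under this reading, a larger distance after adding synthetic samples would be consistent with the abstract's claim that the classes became more distinguishable.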