Here, the authors proposed a solution to improve training performance for human action recognition when training data are limited. To this end, they designed three convolutional neural network (CNN) architectures. First, they generated four channels of information from each frame, namely optical flow and gradients in the horizontal and vertical directions, as input to three-dimensional (3D) CNNs. They then proposed three architectures: single-stream, two-stream, and four-stream 3D CNNs. In the single-stream model, all four channels from each frame were applied to a single stream. In the two-stream architecture, optical flow-x and optical flow-y were applied to one stream and gradient-x and gradient-y to another. In the four-stream architecture, each of the four information channels was applied to its own stream. The architectures were evaluated in an action recognition system on IXMAS, a data set recorded simultaneously by five cameras. The authors showed that the four-stream architecture outperformed the other architectures, achieving recognition rates of 87.5, 91.66, 91.11, 88.05, and 81.94% for cameras 0-4, respectively (88.05% on average).
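The channel-to-stream grouping described in this abstract can be sketched as a small lookup table. This is only an illustration of the grouping, not the authors' implementation: the names `CHANNELS`, `STREAM_LAYOUTS`, and `split_channels` are hypothetical, and the 3D CNNs themselves are not modeled. The last lines recompute the reported average from the five per-camera rates.

```python
# Hypothetical channel names for the four per-frame information channels
# (optical flow and gradient, each in the x and y directions).
CHANNELS = ("flow_x", "flow_y", "grad_x", "grad_y")

# Assumed layout table: each architecture maps to a list of streams,
# and each stream lists the channels it receives, as the abstract describes.
STREAM_LAYOUTS = {
    "single_stream": [("flow_x", "flow_y", "grad_x", "grad_y")],
    "two_stream": [("flow_x", "flow_y"), ("grad_x", "grad_y")],
    "four_stream": [("flow_x",), ("flow_y",), ("grad_x",), ("grad_y",)],
}


def split_channels(frame_channels, layout):
    """Group a frame's channels into per-stream inputs for the given layout.

    frame_channels: dict mapping each channel name to its data.
    Returns a list with one dict (channel name -> data) per stream.
    """
    return [
        {name: frame_channels[name] for name in group}
        for group in STREAM_LAYOUTS[layout]
    ]


# Per-camera recognition rates (%) reported for the four-stream model,
# cameras 0 to 4.
FOUR_STREAM_RATES = [87.5, 91.66, 91.11, 88.05, 81.94]
mean_rate = sum(FOUR_STREAM_RATES) / len(FOUR_STREAM_RATES)

if __name__ == "__main__":
    dummy_frame = {name: None for name in CHANNELS}  # placeholder data
    print(len(split_channels(dummy_frame, "two_stream")), "streams")
    print(f"mean recognition rate: {mean_rate:.2f}%")
```

The unweighted mean of the five per-camera rates agrees with the 88.05% average stated in the abstract.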
U-Net based algorithms, owing to their complex computations, have limitations when used in clinical devices. In this paper, we addressed this problem through a novel U-Net based architecture, called fast and accurate U-Net, for the medical image segmentation task. The proposed fast and accurate U-Net model contains four tuned 2D-convolutional, 2D-transposed convolutional, and batch normalization layers as its main layers. There are four blocks in the encoder-decoder path. The proposed architecture was evaluated on a prepared dataset for head circumference and abdominal circumference segmentation tasks, and on a public dataset (HC18-Grand challenge dataset) for fetal head circumference measurement. The proposed fast network significantly reduced the processing time in comparison with U-Net, dilated U-Net, R2U-Net, attention U-Net, and MFP U-Net. It took 0.47 seconds to segment a fetal abdominal image. In addition, on the prepared dataset, the proposed accurate model achieved Dice and Jaccard coefficients of 97.62% and 95.43% for fetal head segmentation and 95.07% and 91.99% for fetal abdominal segmentation. Moreover, we obtained Dice and Jaccard coefficients of 97.45% and 95.00% on the public HC18-Grand challenge dataset. Based on these results, we concluded that a simple, well-structured, fine-tuned model used in clinical devices can outperform complex models.
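For readers unfamiliar with the two overlap metrics reported here, the following is a minimal sketch of how Dice and Jaccard coefficients are commonly computed for a pair of binary masks. It is not the authors' evaluation code: the function name `dice_and_jaccard` is hypothetical, the masks are plain nested lists, and returning 1.0 for two empty masks is an assumed convention.

```python
def dice_and_jaccard(pred, target):
    """Return (dice, jaccard) for two binary masks given as nested lists.

    Dice    = 2 * |P ∩ T| / (|P| + |T|)
    Jaccard = |P ∩ T| / |P ∪ T|
    """
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must contain the same number of pixels")

    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    pred_sum = sum(pred_flat)
    target_sum = sum(target_flat)
    union = pred_sum + target_sum - intersection

    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0

    dice = 2 * intersection / (pred_sum + target_sum)
    jaccard = intersection / union
    return dice, jaccard


# Toy 2x3 masks: 2 overlapping pixels, 3 foreground pixels in each mask.
pred_mask = [[1, 1, 0], [0, 1, 0]]
target_mask = [[1, 0, 0], [0, 1, 1]]
dice, jaccard = dice_and_jaccard(pred_mask, target_mask)

if __name__ == "__main__":
    print(f"Dice = {dice:.4f}, Jaccard = {jaccard:.4f}")
```

For a single pair of masks the two metrics are tied by J = D / (2 - D). Averages taken over many images need not satisfy that identity exactly, so dataset-level Dice and Jaccard values cannot always be converted into one another this way.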
We introduced the Double Attention Res-U-Net architecture to address the medical image segmentation problem across different medical imaging systems. Accurate medical image segmentation faces several challenges, including the difficulty of modeling different objects of interest, the presence of noise, and signal dropout during measurement. Baseline image segmentation approaches are not sufficient for complex target segmentation across the various medical image types. To overcome these issues, a novel U-Net-based model is proposed that consists of two consecutive networks with five and four encoding and decoding levels, respectively. In each network, there are four residual blocks between the encoder-decoder path and skip connections that help the network tackle the vanishing gradient problem, followed by multi-scale attention gates that generate richer contextual information. To evaluate our architecture, we investigated three distinct datasets: the CVC-ClinicDB dataset, the Multi-site MRI dataset, and a collected ultrasound dataset. The proposed algorithm achieved Dice and Jaccard coefficients of 95.79% and 91.62%, respectively, for CRL, and 93.84% and 89.08% for fetal foot segmentation. Moreover, the proposed model outperformed the state-of-the-art U-Net based model on the external CVC-ClinicDB and multi-site MRI datasets, with Dice and Jaccard coefficients of 83% and 75.31% for CVC-ClinicDB and 92.07% and 87.14% for the multi-site MRI dataset, respectively.