Human activity recognition requires both visual and temporal cues, which makes integrating these two modalities challenging. The usual integration schemes either average the two feature streams or apply fixed weights to them for all samples. However, how much weight each sample and modality should receive is still an open question. A mixture of experts with a gating Convolutional Neural Network (CNN) is one promising architecture for adaptively weighting every sample within a dataset. In this paper, rather than just averaging or using fixed weights, we investigate how a network resembling a natural associative cortex integrates expert networks to form a gating CNN scheme. Starting from RGB (red, green, blue) values and optical flows, we show that with proper treatment the gating CNN scheme works well, pointing to future approaches to information integration in activity recognition.
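The difference between averaging and per-sample gating can be illustrated with a minimal sketch. It assumes two expert streams (RGB and optical flow) that each output class probabilities, and a hypothetical `gate_logits` array standing in for the output of a gating network; the numbers are made up and this is not the architecture used in the paper.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def average_fusion(rgb_scores, flow_scores):
    """Baseline: the same 0.5/0.5 weights for every sample."""
    return 0.5 * (rgb_scores + flow_scores)

def gated_fusion(rgb_scores, flow_scores, gate_logits):
    """Per-sample weights from a gate (assumed shape: n_samples x 2)."""
    weights = softmax(gate_logits, axis=1)
    return weights[:, :1] * rgb_scores + weights[:, 1:] * flow_scores

# Illustrative class probabilities for two samples and three classes.
rgb_scores = np.array([[0.7, 0.2, 0.1], [0.3, 0.4, 0.3]])
flow_scores = np.array([[0.1, 0.8, 0.1], [0.1, 0.1, 0.8]])
# Hypothetical gate output: sample 0 trusts both streams equally,
# sample 1 strongly prefers the optical-flow expert.
gate_logits = np.array([[0.0, 0.0], [-2.0, 2.0]])

fused = gated_fusion(rgb_scores, flow_scores, gate_logits)
print(fused)
```

With equal gate logits the gated result reduces to plain averaging, so the averaging baseline is a special case of the gated scheme.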
Covid-19 detection is an important step in identifying patients with suspected Covid-19 early so that further steps can be taken. One way of detection is through pulmonary x-ray images. However, besides requiring a model that produces high accuracy, lightweight computation is needed so that the model can be deployed in a detection device. Deep CNN models can detect accurately but tend to require large amounts of memory. A CNN with fewer parameters saves storage and memory, so it can run in real time both in detection devices and in cloud-based decision-making systems. In addition, a CNN with fewer parameters can be deployed on FPGAs and other hardware with limited memory capacity. To produce accurate COVID-19 detection on pulmonary x-ray images with lightweight computation, we propose a small but reliable CNN architecture that uses a channel shuffle technique, called ShuffleNet. In this study, we tested and compared the capabilities of ShuffleNet, EfficientNet, and ResNet50, because they have fewer parameters than typical CNNs such as VGGNet or FullConv, which use full convolution layers, yet retain strong detection capability. We used 1125 x-ray images and achieved an accuracy of 86.93% with 18.55 times fewer model parameters than EfficientNet and 22.36 times fewer than ResNet50 in detecting 3 categories, namely Covid-19, Pneumonia, and normal, under 5-fold cross-validation. The memory required by each CNN architecture to perform one detection is linearly related to its number of parameters: ShuffleNet requires only 0.646 GB of GPU memory, which is 0.43 times that of ResNet50, 0.2 times that of EfficientNet, and 0.53 times that of FullConv. Furthermore, ShuffleNet performs the fastest detection, at 0.0027 seconds.
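The channel shuffle operation mentioned above can be sketched in a few lines. This is a minimal NumPy illustration of the commonly described reshape, transpose, reshape formulation, assuming an input in (batch, channels, height, width) layout; the function name and the toy input are illustrative and not taken from the paper's code.

```python
import numpy as np

def channel_shuffle(x, groups):
    """Interleave channels across groups for an array shaped (N, C, H, W)."""
    n, c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    # Split channels into (groups, channels_per_group), swap the two axes,
    # then flatten back so that neighbouring channels come from different groups.
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(n, c, h, w)

# Six channels labelled 0..5, split into two groups: [0, 1, 2] and [3, 4, 5].
x = np.arange(6).reshape(1, 6, 1, 1)
shuffled = channel_shuffle(x, groups=2)
order = shuffled[0, :, 0, 0].tolist()
print(order)  # [0, 3, 1, 4, 2, 5]
```

The operation has no learnable parameters, which is consistent with its use in architectures that aim for a small parameter count.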
Identifying the determinant factors that contribute to the prediction requires multivariate analysis that captures coarse-to-fine contextual information. Our preliminary descriptive analysis shows that an environmental factor such as ultraviolet (UV) radiation is one of the essential factors to consider when observing the drivers of the COVID-19 epidemic. Moreover, education, government, morphological, health, economic, and behavioral factors also contribute to the growth of COVID-19. Besides descriptive analysis, this research uses multivariate analysis to provide comprehensive explanations of the factors contributing to pandemic dynamics. To achieve rich explanations, visual attribution of an explainable Convolution-LSTM is used to reveal the factors that contribute most to the growth of daily COVID-19 cases. Our model consists of a 1D CNN in the first layer to capture local relationships among variables, followed by LSTM layers to capture local dependencies over time. It produces the lowest prediction errors among the existing models compared. This permits us to employ gradient-based visual attribution to generate saliency maps for each time dimension and variable. These maps are then used to explain which variables, during which periods of the interval, contribute to a given time-series prediction, as well as during which time intervals the most important variables jointly contributed to that prediction. The explanations are useful for stakeholders making decisions during and after pandemics. The explainable Convolution-LSTM code is available here: https://github.com/cbasemaster/time-series-attribution .
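The idea of a gradient-based saliency map over time steps and variables can be illustrated on a toy model. The sketch below does not implement the Convolution-LSTM from the abstract; it assumes a small hand-written differentiable model (`toy_model`, with made-up weights `w_var` and `w_time`) so that the gradient of the prediction with respect to each input entry can be written analytically.

```python
import numpy as np

def toy_model(x, w_var, w_time):
    """Toy stand-in for a forecaster: mix variables, squash, then mix time steps."""
    hidden = np.tanh(x @ w_var)      # shape (T,)
    return float(hidden @ w_time)    # scalar prediction

def saliency_map(x, w_var, w_time):
    """|d prediction / d x[t, v]| for every time step t and variable v."""
    hidden = np.tanh(x @ w_var)
    # Chain rule: dy/dx[t, v] = w_time[t] * (1 - tanh^2) * w_var[v]
    grad = (w_time * (1.0 - hidden ** 2))[:, None] * w_var[None, :]
    return np.abs(grad)

rng = np.random.default_rng(0)
T, V = 5, 3                          # 5 time steps, 3 variables (illustrative)
x = rng.normal(size=(T, V))
w_var = rng.normal(size=V)
w_time = rng.normal(size=T)

sal = saliency_map(x, w_var, w_time)
top_t, top_v = np.unravel_index(sal.argmax(), sal.shape)
print("most influential (time step, variable):", int(top_t), int(top_v))
```

In a real network the gradient would come from automatic differentiation rather than a hand-derived formula, but the resulting map has the same (time, variable) layout.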
Assessing the structure and function of organelles in living cells of the primitive unicellular red alga Cyanidioschyzon merolae on three-dimensional sequential images demands a reliable automated technique that copes with the class imbalance among the various cellular structures during mitosis. Existing classification networks with commonly used loss functions focus on the more numerous cellular structures, which makes the system unreliable. Hence, we propose a balanced deep regularized weighted compound dice loss (RWCDL) network for better localization of cell organelles. Specifically, we introduce two new loss functions, compound dice (CD) and RWCD, which implement a multi-class variant of dice and a weighting mechanism, respectively, to maximize the weights of the peroxisome and nucleus among five classes; these are the main contribution of this study. We extended a U-Net-like convolutional neural network (CNN) architecture to evaluate how well the proposed loss functions improve segmentation. The feasibility of the proposed approach is confirmed on three different large-scale mitotic cycle datasets with different numbers of occurrences of cell organelles. In addition, we compared the training behavior of our designed architectures against the ground truth segmentation using various performance measures. The proposed balanced RWCDL network achieved the highest area under the curve (AUC) value for the small and obscure peroxisome and nucleus, 30% higher than the network with the commonly used mean square error (MSE) and dice loss (DL) functions. The experimental results indicate that the proposed approach can efficiently identify the cellular structures even when the contour between the cells is obscure. We therefore conclude that the balanced deep RWCDL approach is reliable and can help biologists accurately identify the relationship between cell behavior and the structures of cell organelles during mitosis.
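To make the role of class weighting concrete, the following is a minimal sketch of a generic weighted multi-class soft dice loss. It is not the authors' exact CD or RWCD formulation (the abstract does not give the formulas, and any regularization term is omitted here); the function name, the weights, and the toy labels are assumptions for illustration.

```python
import numpy as np

def weighted_dice_loss(probs, onehot, class_weights, eps=1e-6):
    """Generic weighted soft dice loss.

    probs, onehot: arrays of shape (n_pixels, n_classes).
    class_weights: one non-negative weight per class (normalised internally).
    """
    intersection = (probs * onehot).sum(axis=0)
    denom = probs.sum(axis=0) + onehot.sum(axis=0)
    dice = (2.0 * intersection + eps) / (denom + eps)   # per-class dice score
    w = np.asarray(class_weights, dtype=float)
    w = w / w.sum()
    return float(1.0 - (w * dice).sum())

# Toy imbalanced labels: class 0 is frequent, classes 1 and 2 are rare.
labels = np.array([0, 0, 0, 0, 1, 2])
onehot = np.eye(3)[labels]

# A prediction that misses the single pixel of rare class 2.
pred_labels = np.array([0, 0, 0, 0, 1, 0])
probs = np.eye(3)[pred_labels]

loss_uniform = weighted_dice_loss(probs, onehot, [1, 1, 1])
loss_weighted = weighted_dice_loss(probs, onehot, [1, 1, 4])  # up-weight class 2
loss_perfect = weighted_dice_loss(onehot, onehot, [1, 1, 4])
print(loss_uniform, loss_weighted, loss_perfect)
```

Up-weighting the rare class makes the same mistake cost more, which is the general motivation for weighting small structures in an imbalanced segmentation task.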
This letter describes a network that is able to capture multimodal correlations over arbitrary timestamps. The proposed scheme operates as a complementary, extended network on top of a multimodal CNN. For action recognition, the spatial and temporal streams are vital components of deep Convolutional Neural Networks (CNNs), but reducing overfitting and fusing these two streams remain open problems. The existing fusion approach is to average the two streams. To address this, we propose a correlation network with a Shannon fusion that learns on top of a CNN that has already been trained. Long-range video may contain spatiotemporal correlations over arbitrary times. These correlations can be captured using simple fully connected layers, which form the correlation network. The correlation network is found to be complementary to existing network fusion methods. We evaluate our approach on the UCF-101 and HMDB-51 datasets, and the resulting improvement in accuracy demonstrates the importance of multimodal correlation.