ObjectMix

Kimata, Jun; Nitta, Tomoya; Tamaki, Toru

doi:10.1145/3551626.3564941

Cited by 5 publications

(1 citation statement)

References 22 publications


“…The traditional action recognition datasets are mainly common actions intercepted from major video websites, such as UCF-101 [12], which mainly covers moving-object interaction, human-human interaction, playing musical instruments, body movements, and sports. The HMDB51 [13] dataset includes human-object interaction, human-human interaction, facial movements, facial movements of manipulated objects, and body movements. The backgrounds of the above two datasets are dynamic and there are many kinds of actions.…”

Section: Datasets Related To Rescue Operations Have Not Been Reported In (mentioning)

confidence: 99%

P‐2.31: An Emergency Rescue Action Recognition Method Based on Improved Spatiotemporal Decomposition Network

Zhang

Tianxiang

2023

Symp Digest of Tech Papers


To reduce the computational overhead of convolutional neural networks in rescue action recognition, this paper proposes a rescue action recognition method that combines a spatiotemporal decomposition network with a channel attention mechanism. With channel attention added, the model generates a weight for each feature channel, normalizes these weights, and applies them to the corresponding channels to improve recognition accuracy. The combined network performs better in both computational overhead and recognition accuracy: accuracy improves over the original model (S3D) on rescue action data when only RGB video is used as input, and it also improves on the public UCF-101 and KTH datasets. The experimental results show that the proposed method can effectively improve the accuracy of action recognition on both the rescue action data and the public datasets.
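The channel-weighting step described in this abstract can be illustrated with a small sketch. The abstract does not give the layer design, so the code below assumes a squeeze-and-excitation style block (global average per channel, a small bottleneck, sigmoid normalization); the function name `channel_attention` and the weight matrices `w_reduce` and `w_expand` are hypothetical, and each channel is represented as a flat list of activations for simplicity.

```python
import math


def channel_attention(features, w_reduce, w_expand):
    """Reweight feature channels (illustrative sketch, not the paper's code).

    features: list of C channels, each a flat list of activations.
    w_reduce: hypothetical bottleneck weights, shape (H x C).
    w_expand: hypothetical expansion weights, shape (C x H).
    Returns (reweighted_features, channel_weights).
    """
    # Squeeze: summarize each channel by its mean activation.
    squeezed = [sum(ch) / len(ch) for ch in features]

    # Excitation: bottleneck layer with ReLU, then expand back to C values.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w_expand]

    # Normalize each channel weight into (0, 1) with a sigmoid.
    weights = [1.0 / (1.0 + math.exp(-z)) for z in logits]

    # Scale: multiply every activation in a channel by that channel's weight.
    scaled = [[w * x for x in ch] for w, ch in zip(weights, features)]
    return scaled, weights


# Two channels with four activations each; weights are made up for the demo.
feats = [[1.0, 3.0, 2.0, 2.0], [0.5, 0.5, 1.5, 1.5]]
scaled, weights = channel_attention(feats, [[1.0, 0.0]], [[1.0], [-1.0]])
```

In this toy setup the first channel receives a weight above 0.5 and the second a weight below 0.5, so the first channel is emphasized relative to the second.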



TRandAugment: temporal random augmentation strategy for surgical activity recognition from videos

et al. 2023


Purpose: Automatic recognition of surgical activities from intraoperative surgical videos is crucial for developing intelligent support systems for computer-assisted interventions. Current state-of-the-art recognition methods are based on deep learning, where data augmentation has shown the potential to improve generalization. This has spurred work on automated and simplified augmentation strategies for image classification and object detection on datasets of still images. Extending such augmentation methods to videos is not straightforward, as the temporal dimension needs to be considered. Furthermore, surgical videos pose additional challenges because they are composed of multiple, interconnected, long-duration activities.

Methods: This work proposes a new simplified augmentation method, called TRandAugment, specifically designed for long surgical videos. It treats each video as an assembly of temporal segments and applies consistent but random transformations to each segment. The proposed augmentation method is used to train an end-to-end spatiotemporal model consisting of a CNN (ResNet50) followed by a TCN.

Results: The effectiveness of the proposed method is demonstrated on two surgical video datasets, Bypass40 and CATARACTS, and two tasks, surgical phase recognition and step recognition. TRandAugment adds a performance boost of 1–6% over previous state-of-the-art methods that use manually designed augmentations.

Conclusion: This work presents a simplified and automated augmentation method for long surgical videos. The proposed method has been validated on different datasets and tasks, indicating the importance of devising temporal augmentation methods for long surgical videos.
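The idea of applying "consistent but random transformations to each segment" can be sketched as follows. This is only an illustration of segment-wise augmentation under stated assumptions, not the TRandAugment implementation: the equal-length segmentation, the three toy operations, and the names `segment_bounds`, `segmentwise_augment`, `num_segments`, and `max_magnitude` are all hypothetical. Frames are small grayscale images stored as lists of rows.

```python
import random


def identity(frame, magnitude):
    # Return an unchanged copy of the frame.
    return [row[:] for row in frame]


def hflip(frame, magnitude):
    # Mirror each row; magnitude is ignored for this operation.
    return [row[::-1] for row in frame]


def brighten(frame, magnitude):
    # Add a constant to every pixel, clipped to the 8-bit range.
    return [[min(255, p + magnitude) for p in row] for row in frame]


OPS = [identity, hflip, brighten]  # toy operation pool (assumption)


def segment_bounds(num_frames, num_segments):
    # Split [0, num_frames) into contiguous, roughly equal segments.
    return [(i * num_frames // num_segments,
             (i + 1) * num_frames // num_segments)
            for i in range(num_segments)]


def segmentwise_augment(video, num_segments, max_magnitude, rng):
    """Sample one (operation, magnitude) per segment and apply it to
    every frame in that segment, so frames within a segment stay consistent."""
    augmented, plan = [], []
    for start, end in segment_bounds(len(video), num_segments):
        op = rng.choice(OPS)
        magnitude = rng.randint(0, max_magnitude)
        plan.append((start, end, op.__name__, magnitude))
        augmented.extend(op(frame, magnitude) for frame in video[start:end])
    return augmented, plan


# Ten 2x2 frames split into three segments, with a seeded generator.
video = [[[i, i + 1], [i + 2, i + 3]] for i in range(10)]
augmented, plan = segmentwise_augment(video, 3, 30, random.Random(0))
```

Sampling the transformation once per segment, rather than once per frame or once per video, keeps neighbouring frames visually coherent while still varying the augmentation along the length of a long video.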


Mitigating and Evaluating Static Bias of Action Representations in the Background and the Foreground

Li, Liu, Zhang et al. 2023

2023 IEEE/CVF International Conference on Computer Vision (ICCV)

