2019
DOI: 10.1109/tip.2019.2917283

Dense Dilated Network for Video Action Recognition

citations
Cited by 33 publications
(7 citation statements)
references
References 32 publications
“…The simulation test shows that the recognition algorithm can accurately determine the damage location and damage degree, and the result is stable. A sports video recognition model combining multiple features and NN was proposed in [18,19]. It extracts the static and dynamic features that reflect the sports video, classifies them separately using the RBF neural network, constructs the basic probability assignment of the preliminary recognition results, and uses the evidence theory to fuse the preliminary results to obtain the sports video recognition results.…”
Section: Related Work (mentioning)
confidence: 99%
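
The fusion step described in this statement (basic probability assignments combined through evidence theory) can be illustrated with Dempster's rule of combination. The following is a minimal sketch, not the cited authors' implementation: the function name `dempster_combine`, the class labels, and the mass values are all hypothetical and chosen only for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Fuse two basic probability assignments (BPAs) with Dempster's rule.

    Each BPA maps a frozenset of hypotheses to a mass; masses sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence accumulates on the shared hypotheses.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal sets contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; fusion is undefined")
    # Renormalize so the fused masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical preliminary results from a static-feature and a
# dynamic-feature classifier over two illustrative classes.
RUN, JUMP = frozenset({"run"}), frozenset({"jump"})
EITHER = RUN | JUMP  # mass left on "undecided"
static_bpa = {RUN: 0.6, JUMP: 0.3, EITHER: 0.1}
dynamic_bpa = {RUN: 0.7, JUMP: 0.2, EITHER: 0.1}

fused = dempster_combine(static_bpa, dynamic_bpa)
decision = max(fused, key=fused.get)
```

Because both hypothetical classifiers favour the same class, the fused mass on that class is higher than either input mass, which is the behaviour the statement relies on when it fuses preliminary results.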
“…Numerous studies have thus far been conducted in the field of computer vision on human action recognition using video data sets, and state-of-the-art accuracy is being updated with new studies reported frequently [30][31][32][33]; however, these technologies have not been widely applied to the field of surgery. To our knowledge, this is the first study to use 3-D CNN for actual laparoscopic surgical skill assessment.…”
Section: Discussion (mentioning)
confidence: 99%
“…Dilation convolution [25] in action recognition usually adopted to model temporal features and extract larger contextual information. In [26], a dense dilated network was trained to recognize actions from clip-level to global-level, by fusing outputs from each densely-connected dilated convolutions layer. In temporal aggregation network (TAN) [27], a dedicated temporal aggregation block was designed to encode multi-scale spatio-temporal patterns, and larger temporal context can be captured by dilated convolutions effectively.…”
Section: Dilation Convolution Network (mentioning)
confidence: 99%
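
To make the idea in this statement concrete, the sketch below shows how dilated temporal convolutions widen the context seen by each output, and how per-layer outputs of a densely connected stack might be fused. It is a toy, single-channel illustration in plain Python under stated assumptions, not the architecture of the cited paper: averaging earlier features stands in for channel concatenation, the final fusion is a simple mean, and all function names are hypothetical.

```python
def dilated_conv1d(x, kernel, dilation):
    """Zero-padded 'same' 1D convolution whose taps are `dilation` steps apart."""
    center = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + (j - center) * dilation
            if 0 <= idx < len(x):  # positions outside the sequence count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input time steps that influence one output of the stack."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dense_dilated_stack(x, kernel, dilations):
    """Toy densely connected stack of dilated convolutions.

    Assumption: each layer's input is the mean of the original input and all
    earlier layer outputs (a stand-in for concatenation), and the result is
    the mean of every layer's output (a stand-in for learned fusion).
    """
    if not dilations:
        raise ValueError("at least one dilation rate is required")
    features = [list(x)]
    for d in dilations:
        layer_in = [sum(vals) / len(features) for vals in zip(*features)]
        features.append(dilated_conv1d(layer_in, kernel, d))
    layer_outputs = features[1:]
    return [sum(vals) / len(layer_outputs) for vals in zip(*layer_outputs)]


# Three layers with kernel size 3 and dilations 1, 2, 4 cover 15 time steps.
context = receptive_field(3, [1, 2, 4])
fused = dense_dilated_stack([0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 1.0],
                            [0.25, 0.5, 0.25], [1, 2, 4])
```

Doubling the dilation at each layer grows the temporal context much faster than stacking undilated layers of the same kernel size, which is why dilation is attractive for moving from clip-level to longer-range context.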