2017 25th Signal Processing and Communications Applications Conference (SIU)
DOI: 10.1109/siu.2017.7960551

Using deep multiple instance learning for action recognition in still images

Cited by 3 publications (2 citation statements)
References 14 publications
“…The motivation of this paper appears to be different from ours and it does not take advantage of the superior capabilities of long-term memory units such as the LSTM units. Only recently, deep multiple-instance learning has been proposed and used in the context of object detection and annotations [7,30,32]. Our work is an extension to the prior literature by combining multiple-instance learning with deep recurrent neural networks which appears to be novel.…”
Section: Literature Review (mentioning)
confidence: 96%
“…Existing methods typically utilize deep convolutional neural networks (CNNs) to classify actions by directly distilling spatial features from images. For example, Bas et al. (2017) [1] first extracted the objects from the images using AlexNet [2] and then performed action recognition on the target regions through a multi-instance learning framework. [3] constructed a multi-branch attention network based on VGG16 [4], incorporating multiple regional attention mechanisms for capturing information from regions of different sizes.…”
Section: Introduction (mentioning)
confidence: 99%