2014
DOI: 10.1007/s11263-014-0723-7

Reduced Analytic Dependency Modeling: Robust Fusion for Visual Recognition

Cited by 25 publications (7 citation statements)
References 47 publications
“…In spite of that, to get an approximate idea about how GeThR-Net performs compared to these methods, we provide some comparisons. The mAP reported on CCV by some of the recent methods are: 70.6 [40], 64.0 [46], 63.4 [17], 60.3 [43], 68.2 [16], 64.0 [10] and 83.5 [41]. We perform better (mAP of 79.3) than six of these methods.…”
Section: Discussion On Results (mentioning)
confidence: 91%
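The excerpt above compares methods by mean average precision (mAP) on the CCV benchmark. For readers unfamiliar with the metric, the sketch below shows one common way mAP is computed: average precision per event class, averaged over the classes. The class names, labels, and scores are purely illustrative and are not taken from the cited papers.

import numpy as np

def average_precision(labels: np.ndarray, scores: np.ndarray) -> float:
    # AP for one class: mean of the precision values at the ranks of the positives.
    order = np.argsort(-scores)                 # rank samples by descending score
    labels = labels[order]
    hits = np.cumsum(labels)                    # positives retrieved so far
    ranks = np.arange(1, len(labels) + 1)
    prec_at_pos = hits[labels == 1] / ranks[labels == 1]
    return float(prec_at_pos.mean()) if prec_at_pos.size else 0.0

# Hypothetical ground truth (1 = event present) and classifier scores per class.
y_true = {
    "basketball": np.array([1, 0, 1, 0, 0, 1]),
    "wedding":    np.array([0, 1, 0, 0, 1, 0]),
}
y_score = {
    "basketball": np.array([0.9, 0.7, 0.6, 0.4, 0.3, 0.2]),
    "wedding":    np.array([0.2, 0.8, 0.1, 0.5, 0.7, 0.3]),
}

# mAP: average precision averaged over all event classes.
mAP = np.mean([average_precision(y_true[c], y_score[c]) for c in y_true])
print(f"mAP = {mAP:.3f}")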
“…The dataset has 393 training, 287 validation, and 276 testing sequences. Each sequence is between 1 and 2 minutes long and contains 8–20 gestures. Furthermore, the test sequences also have 'distracter' (out-of-vocabulary) gestures apart from the 20 main gesture categories.…”
Section: Dataset Details (mentioning)
confidence: 99%
“…Ye et al. [50]: 64.0%; Wang et al. [44]: 85.9%; Jhuo et al. [12]: 64.0%; Tran et al. [40]: 86.7%; Ma et al. [27]: 63.4%; Simonyan et al. [33]: 88.0%…”
Section: UCF-101 (mentioning)
confidence: 99%
“…Similar to collections of images, videos have predominantly been encoded as the mean visual feature of the sampled frames [13,15]. Alternatively, a Fisher Vector computed over several low-level video descriptors, such as SIFT, STIP and HOG, has been used [17,18,26].…”
Section: Video Event Recognition (mentioning)
confidence: 99%
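The statement above contrasts two common video encodings: averaging per-frame features versus aggregating low-level descriptors (SIFT, STIP, HOG) into a Fisher Vector. A minimal sketch of the first option, mean-pooling frame features, is given below; the array shapes and random features are hypothetical placeholders, not the pipeline used in the cited works.

import numpy as np

def encode_video_mean(frame_features: np.ndarray) -> np.ndarray:
    # frame_features: (num_sampled_frames, feature_dim) per-frame descriptors,
    # e.g. activations extracted from sampled frames.
    # Returns one fixed-length vector representing the whole video.
    return frame_features.mean(axis=0)

# Hypothetical example: 30 sampled frames with 128-dimensional descriptors.
rng = np.random.default_rng(0)
frames = rng.standard_normal((30, 128))
video_vector = encode_video_mean(frames)
print(video_vector.shape)  # (128,) -- a single vector per video, ready for a classifier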