2014 IEEE Conference on Computer Vision and Pattern Recognition 2014
DOI: 10.1109/cvpr.2014.276

Scalable Object Detection Using Deep Neural Networks

Abstract: Deep convolutional neural networks have recently achieved state-of-the-art performance on a number of image recognition benchmarks, including the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC-2012). The winning model on the localization sub-task was a network that predicts a single bounding box and a confidence score for each object category in the image. Such a model captures the whole-image context around the objects but cannot handle multiple instances of the same object in the image without nai…

Cited by 973 publications (590 citation statements)
References 12 publications
“…achieving high object recall with less number of bounding boxes, preferably with a small computational overhead and the potential to scale to hundreds of object categories [35,40,37]. Here, discriminative methods based on deep learning models have helped improve the ranking quality of proposal approaches [7,37,28,32]. Inspired by this work, we extend the use of deep and recurrent networks to temporal action proposal generation by introducing a new architecture.…”
Section: Related Work (mentioning)
confidence: 99%
“…This problem can be solved by alternating between solving the assignment problem for a given θ_k and back-propagating errors given an optimal assignment x_k. For simplicity we rely on a heuristic similar to [7] to relax the assignment problem by introducing K anchor segments…”
Section: Inference and Learning (mentioning)
confidence: 99%
“…This is especially critical for larger sizes of visual words, which are in turn need to obtain good clustering results. However, once RFCs are computed during the learning step, the proposed multimodal detector runs in about two frames per second using Matlab and MEX files.…”
Section: Real Experiments - Face and Object Detection (mentioning)
confidence: 99%
“…There exist a vast amount of methods that are able to detect and identify objects in images with striking results, in spite of diverse factors that make difficult this problem such as lighting changes, scaling, cluttered backgrounds, object deformations, general 3D rotations, and intra-class variations [10,11,15,26,38].…”
Section: Introduction (mentioning)
confidence: 99%