Finding Tiny Faces

Hu, Peiyun; Ramanan, Deva

doi:10.1109/cvpr.2017.166

Cited by 701 publications

(623 citation statements)

References 35 publications

(67 reference statements)

Supporting

Mentioning

618

Contrasting

Unclassified

“…Some of the top-performing systems consist of commercial software, thus we did use the deep methods of Hu and Ramanan (2016), that are available as open source with the method of Hu and Ramanan (2016) reporting the latest best performance in FDDB. Additionally, we employ the top performing SVM-based method for learning rigid templates (King 2015), the best weakly and strongly supervised DPM implementations of Mathias et al (2014) and Zhu and Ramanan (2012), along with the best performing exemplar-based technique of Kumar et al (2015).…”

Section: Face Detection (mentioning)

confidence: 99%

A Comprehensive Performance Evaluation of Deformable Face Tracking “In-the-Wild”

et al. 2017

Recently, technologies such as face detection, facial landmark localisation and face recognition and verification have matured enough to provide effective and efficient solutions for imagery captured under arbitrary conditions (referred to as "in-the-wild"). This is partially attributed to the fact that comprehensive "in-the-wild" benchmarks have been developed for face detection, landmark localisation and recognition/verification. A very important technology that has not been thoroughly evaluated yet is deformable face tracking "in-the-wild". Until now, the performance has mainly been assessed qualitatively by visually assessing the result of a deformable face tracking technology on short videos. In this paper, we perform the first, to the best of our knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines using the recently introduced 300 VW benchmark. We evaluate many different architectures focusing mainly on the task of on-line deformable face tracking. In particular, we compare the following general strategies: (a) generic face detection plus generic facial landmark localisation, (b) generic model free tracking plus generic facial landmark localisation, as well as (c) hybrid approaches using state-of-the-art face detection, model free tracking and facial landmark localisation technologies. Our evaluation reveals future avenues for further research on the topic.

“…More than 85% of ships have an area smaller than 8000 m², that is, around 80 pixels on a SAR image, which is less than the object size of the ImageNet dataset (more than 80% of objects have sizes between 40 and 140 pixels) [33]. Additionally, the ships which offer AIS information have an average length of 168.3 m. Furthermore, the average area is around 51 pixels which is far less than the area that is able to cause a response on the last convolutional layer of VGG16.…”

Section: Experiments Dataset and Settings (mentioning)

confidence: 97%
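
The two figures quoted above (8000 m² and about 80 pixels) are consistent with each pixel covering roughly 100 m², i.e. a 10 m × 10 m pixel spacing. That spacing is inferred from the quoted numbers and is not stated in the excerpt; the function name and its default value below are hypothetical. A minimal sketch of the conversion:

```python
def area_to_pixels(area_m2, pixel_spacing_m=10.0):
    """Convert a ground area in square metres to a pixel count.

    pixel_spacing_m is an assumed square pixel size (10 m here),
    inferred from the quoted figures rather than given in the excerpt.
    """
    if pixel_spacing_m <= 0:
        raise ValueError("pixel_spacing_m must be positive")
    return area_m2 / (pixel_spacing_m * pixel_spacing_m)


# A ship footprint of 8000 m^2 at the assumed 10 m spacing.
print(area_to_pixels(8000.0))
```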

“…For instance, an object located on land is highly unlikely to be considered a ship, while an object with bright intensity in the ocean area is prone to be affirmed as a positive object. In order to mimic the visual effect of a human being in a computer vision field, context information is always added into the deep neural network to recognize the small-sized objects [27,29,33].…”

Section: Integrating Contextual Information (mentioning)

confidence: 99%

Contextual Region-Based Convolutional Neural Network with Multilayer Fusion for SAR Ship Detection

et al. 2017

Synthetic aperture radar (SAR) ship detection has been playing an increasingly essential role in marine monitoring in recent years. The lack of detailed information about ships in wide swath SAR imagery poses difficulty for traditional methods in exploring effective features for ship discrimination. Being capable of feature representation, deep neural networks have achieved dramatic progress in object detection recently. However, most of them suffer from the missing detection of small-sized targets, which means that few of them are able to be employed directly in SAR ship detection tasks. This paper discloses an elaborately designed deep hierarchical network, namely a contextual region-based convolutional neural network with multilayer fusion, for SAR ship detection, which is composed of a region proposal network (RPN) with high network resolution and an object detection network with contextual features. Instead of using low-resolution feature maps from a single layer for proposal generation in a RPN, the proposed method employs an intermediate layer combined with a downscaled shallow layer and an up-sampled deep layer to produce region proposals. In the object detection network, the region proposals are projected onto multiple layers with region of interest (ROI) pooling to extract the corresponding ROI features and contextual features around the ROI. After normalization and rescaling, they are subsequently concatenated into an integrated feature vector for final outputs. The proposed framework fuses the deep semantic and shallow high-resolution features, improving the detection performance for small-sized ships. The additional contextual features provide complementary information for classification and help to rule out false alarms. Experiments based on the Sentinel-1 dataset, which contains twenty-seven SAR images with 7986 labeled ships, verify that the proposed method achieves an excellent performance in SAR ship detection.

“…We follow the approach in Ref. and use “oversized” templates, whose spatial support includes background pixels surrounding the object of interest, shown as contextualized templates in Figure . It turns out that including massive amounts of surrounding area (such that 99% of the template includes the background), which may capture additional contextual cues, such as shadows from a ground plane, is helpful for finding small objects.…”

Section: Algorithmic Approach (mentioning)

confidence: 99%
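
The "oversized" template idea in the statement above can be illustrated with a little geometry. The sketch below is not the cited authors' implementation; it only shows how large a window centred on an object must be for a chosen share of its area to be background. The function name, the (x, y, width, height) box convention, and the example box are all assumptions made for illustration.

```python
import math


def contextual_window(box, background_fraction):
    """Return (x0, y0, x1, y1) of a window centred on `box`.

    box is (x, y, width, height) with (x, y) the top-left corner.
    The window keeps the box's aspect ratio and is scaled so that
    `background_fraction` of its area lies outside the box.
    """
    if not 0.0 <= background_fraction < 1.0:
        raise ValueError("background_fraction must be in [0, 1)")
    x, y, w, h = box
    # Object area / window area = 1 - background_fraction, so each side
    # grows by 1 / sqrt(1 - background_fraction).
    scale = 1.0 / math.sqrt(1.0 - background_fraction)
    cx, cy = x + w / 2.0, y + h / 2.0
    half_w, half_h = w * scale / 2.0, h * scale / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


# A 20 x 20 object with 99% background needs a window about 10x wider.
print(contextual_window((90, 90, 20, 20), 0.99))
```

With 99% background, each side of the window is about ten times the side of the object, which gives a sense of how much surrounding area such a template takes in. A real pipeline would also need to clip or pad the window at the image borders, which this sketch leaves out.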

“…We call the method Multiscale Foveal Context (MFC) in our results and refer the reader to Ref. for more quantitative analysis, such as how different ways of encoding context affects performance.…”

Section: Algorithmic Approach (mentioning)

confidence: 99%

Comparing apples and oranges: Off‐road pedestrian detection on the National Robotics Engineering Center agricultural person‐detection dataset

Pezzementi

Tabor

et al. 2017

Journal of Field Robotics

Self Cite

Person detection from vehicles has made rapid progress recently with the advent of multiple high‐quality datasets of urban and highway driving, yet no large‐scale benchmark is available for the same problem in off‐road or agricultural environments. Here we present the National Robotics Engineering Center (NREC) Agricultural Person‐Detection Dataset to spur research in these environments. It consists of labeled stereo video of people in orange and apple orchards taken from two perception platforms (a tractor and a pickup truck), along with vehicle position data from Real Time Kinetic (RTK) GPS. We define a benchmark on part of the dataset that combines a total of 76k labeled person images and 19k sampled person‐free images. The dataset highlights several key challenges of the domain, including varying environment, substantial occlusion by vegetation, people in motion and in nonstandard poses, and people seen from a variety of distances; metadata are included to allow targeted evaluation of each of these effects. Finally, we present baseline detection performance results for three leading approaches from urban pedestrian detection and our own convolutional neural network approach that benefits from the incorporation of additional image context. We show that the success of existing approaches on urban data does not transfer directly to this domain.
