In recent years, deep learning based on deep convolutional neural networks has achieved great success in face detection. However, detecting small-scale faces remains an open challenge. The depth of the convolutional network causes the projected feature map for small faces to shrink quickly, and most scale-invariant detection approaches can hardly handle faces smaller than $15 \times 15$ pixels. To solve this problem, we propose a different scales face detector (DSFD) based on Faster R-CNN. The new network improves the precision of face detection while running in real time, as Faster R-CNN does. First, an efficient multitask region proposal network (RPN), combined with boosting face detection, is developed to obtain the human face ROI. With the ROI as a constraint, anchors are inhomogeneously produced on the top feature map by the multitask RPN. Human face proposals are extracted through the anchors combined with facial landmarks. Then, a parallel-type Fast R-CNN network is proposed based on the proposal scale. According to the percentage of the image they cover, the proposals are assigned to one of three corresponding Fast R-CNN networks. The three networks are separated by proposal scale and differ from each other in the weights of feature map concatenation. A variety of strategies are introduced in our face detection network, including multitask learning, feature pyramid, and feature concatenation. Compared to state-of-the-art face detection methods such as UnitBox, HyperFace, and FastCNN, the proposed DSFD method achieves promising performance on popular benchmarks including FDDB, AFW, PASCAL faces, and WIDER FACE.
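The scale-based assignment of proposals to three parallel networks can be illustrated with a minimal sketch. The abstract does not give the scale boundaries, so the thresholds `small_max` and `medium_max`, the function name `assign_by_scale`, and the group labels are all hypothetical; only the idea of routing each proposal by the fraction of the image it covers comes from the text.

```python
from typing import Dict, List, Tuple

# A proposal box as (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def assign_by_scale(
    boxes: List[Box],
    image_w: int,
    image_h: int,
    small_max: float = 0.01,   # hypothetical boundary, not from the paper
    medium_max: float = 0.10,  # hypothetical boundary, not from the paper
) -> Dict[str, List[Box]]:
    """Group proposals by the fraction of the image area they cover."""
    groups: Dict[str, List[Box]] = {"small": [], "medium": [], "large": []}
    image_area = float(image_w * image_h)
    for box in boxes:
        x1, y1, x2, y2 = box
        # Clamp at zero so a degenerate box cannot yield a negative area.
        covered = max(0.0, x2 - x1) * max(0.0, y2 - y1) / image_area
        if covered < small_max:
            groups["small"].append(box)
        elif covered < medium_max:
            groups["medium"].append(box)
        else:
            groups["large"].append(box)
    return groups
```

In a full detector, each group would then be passed to its own Fast R-CNN branch; that part is omitted here because the abstract does not describe the branches in enough detail to reproduce them.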
In this study, a fast object detection algorithm based on binary deep convolutional neural networks (CNNs) is proposed. Convolution kernels of different sizes are used to predict classes and bounding boxes of multi-scale objects directly in the last feature map of a deep CNN. In this way, rapid object detection with acceptable precision loss is achieved. In addition, binary quantisation of the weight values and input data of each layer is used to squeeze the networks for faster object detection. Compared to full-precision convolution, the proposed binary deep CNNs for object detection result in 62 times faster convolutional operations and 32 times memory saving in theory. What's more, the proposed method is easy to implement in embedded computing systems because of the binary operation for convolution and the low memory requirement. Experimental results on Pascal VOC2007 validate the effectiveness of the authors' proposed method.
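To make the binary operation concrete, the following is a minimal sketch of the commonly used XNOR-and-popcount formulation of a binarised dot product, which is the inner step of a binary convolution. The abstract does not confirm that the authors use exactly this formulation, and the helper names `binarize`, `pack_bits`, and `xnor_popcount_dot` are hypothetical. The sketch also shows where a 32 times memory saving comes from if full precision is taken to mean 32-bit floats (an assumption): each value shrinks from 32 bits to 1 bit.

```python
from typing import List


def binarize(values: List[float]) -> List[int]:
    """Sign quantisation to {-1, +1}; zero is mapped to +1 by convention."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs: List[int]) -> int:
    """Pack signs into one integer: +1 becomes bit 1, -1 becomes bit 0."""
    word = 0
    for i, s in enumerate(signs):
        if s > 0:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_word: int, w_word: int, n: int) -> int:
    """Dot product of two packed {-1, +1} vectors of length n.

    XOR sets a bit wherever the two signs disagree, so
    dot = matches - mismatches = n - 2 * mismatches.
    """
    mismatches = bin(a_word ^ w_word).count("1")
    return n - 2 * mismatches


# Usage: compare against the ordinary dot product of the sign vectors.
activations = [0.7, -1.2, 0.1, -0.4, 2.0, -0.3, 0.0, 1.5]
weights = [-0.5, -0.9, 0.3, 0.8, 1.1, -2.0, 0.4, -0.6]
a, w = binarize(activations), binarize(weights)
reference = sum(x * y for x, y in zip(a, w))
packed = xnor_popcount_dot(pack_bits(a), pack_bits(w), len(a))
assert packed == reference
```

On hardware, the XOR and the bit count each operate on a whole machine word at once, which is the source of the theoretical speed-up over multiply-accumulate arithmetic; the 62 times figure quoted above is the authors' own and is not derived by this sketch.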