State-of-the-art object detection networks such as YOLO, SSD, and Faster R-CNN have achieved great success in object detection. However, these algorithms perform poorly on small objects. To address this problem, we propose Expanding Receptive Field YOLO (ERF-YOLO). First, we propose an efficient block, the expanding receptive field block (ERF-block), to capture more information over larger areas. Based on YOLOv2, we down-sample the low-level location information with the ERF-block and up-sample the feature information by deconvolution. We then combine these two parts to make the prediction. After training on the VOC dataset, the network achieves 82.6% mAP (mean Average Precision), which is 4.0% higher than the original YOLOv2 network. Thanks to the efficient block, the model runs at 62 fps with an input size of 416×416, which maintains real-time speed. In addition, we evaluate the model on a remote sensing dataset that contains many small targets, and our model again shows better performance.
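To make the two-branch fusion concrete, the following is a minimal sketch of the shape arithmetic only, not the ERF-YOLO architecture itself. It assumes a low-level feature map at stride 8 and a deep feature map at stride 32 (the usual YOLOv2 output stride), and it assumes hypothetical kernel, stride, and padding values; the abstract specifies none of these. The helper names `conv_out` and `deconv_out` are illustrative. It also converts the reported 62 fps into a per-image latency.

```python
# Shape arithmetic for fusing a down-sampled low-level map with an
# up-sampled deep map. All layer hyperparameters below are ASSUMPTIONS
# chosen for illustration; they are not taken from the paper.

def conv_out(n, k, s, p, d=1):
    """Spatial output size of a standard convolution."""
    return (n + 2 * p - d * (k - 1) - 1) // s + 1

def deconv_out(n, k, s, p, output_padding=0, d=1):
    """Spatial output size of a transposed convolution (deconvolution)."""
    return (n - 1) * s - 2 * p + d * (k - 1) + output_padding + 1

INPUT_SIZE = 416                 # input size stated in the abstract

low_level = INPUT_SIZE // 8      # assumed stride-8 feature map: 52
deep = INPUT_SIZE // 32          # assumed stride-32 feature map: 13

# Assumed down-sampling branch: 3x3 convolution, stride 2, padding 1.
down = conv_out(low_level, k=3, s=2, p=1)

# Assumed up-sampling branch: 2x2 transposed convolution, stride 2.
up = deconv_out(deep, k=2, s=2, p=0)

# Both branches must share a spatial size before channel-wise concatenation.
can_concatenate = (down == up)

# Per-image latency implied by the reported 62 fps.
latency_ms = 1000 / 62

print(down, up, can_concatenate, round(latency_ms, 1))
```

Under these assumed settings both branches land on a 26×26 grid, so they can be concatenated along the channel axis, and 62 fps corresponds to roughly 16 ms per image.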