Object detection in aerial images is vital for autonomous guidance, navigation and control, and situational awareness. However, researchers in this field still face many challenges, including variation in target scale, the perspective from which images are captured, and highly complex backgrounds. This paper introduces a robust object detector optimized for multi-scale objects and for object instances captured from an overhead perspective in aerial images. First, in the feature extraction stage, an effective multi-scale detector (MSD) is designed to search the feature maps for objects at different scales. Then, to detect small targets against cluttered backgrounds, shallow- and deep-layer features are densely connected through deconvolution, addressing the low dimensionality of deep layers and the inadequate representation of small objects. In the experiments, we analyze the impact of these components on the model and compare the proposed method with other state-of-the-art approaches on two publicly available datasets captured by satellites and high-altitude UAVs. The results show that the proposed method is more effective and robust than the compared approaches and is applicable to a wider range of aerial images.

INDEX TERMS Object detection, aerial images, multi-scale detection, small object relative scale (ORS).

I. INTRODUCTION