Recognition models have achieved exceptional performance in object detection, but they still struggle with Small-Scale Object Detection (SSOD). SSOD remains difficult because objects vary widely in shape and orientation and because the prediction heads at the detection stage are fixed. To overcome these challenges, this paper proposes three components: an auto anchor module, a Spatial Pyramid Pooling Faster (SPPF) layer in the feature refinement network, and an extremely small-scale prediction head at the detection stage of the model. With the auto anchor module, the model adjusts its anchor boxes dynamically during training, which can improve overall performance. The SPPF layer divides the feature map into five pyramid levels, each corresponding to a different scale, and pools features from all levels so that the model can detect objects of different sizes. The extremely small-scale prediction head is designed specifically for small object detection; it uses small anchor boxes and feature re-sampling techniques to improve SSOD accuracy. In quantitative evaluations on the VOC dataset, the model achieves an mAP of 85.4, 79.8% precision, 80.3% recall, and an 80% F1-score. These results indicate that the proposed model outperforms existing models in detecting small-scale objects.
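The abstract does not say how the auto anchor module fits its anchor boxes. As a rough illustration only, the sketch below assumes one common approach: k-means clustering of ground-truth box widths and heights under a 1 - IoU distance. The names `wh_iou` and `fit_anchors` are hypothetical and are not taken from the paper.

```python
import random


def wh_iou(box, anchor):
    """IoU of two boxes aligned at one corner, given only (width, height)."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def fit_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) pairs into k anchors with k-means under a 1 - IoU distance.

    Assumed procedure for illustration; the paper's module may differ.
    """
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k distinct ground-truth boxes
    for _ in range(iters):
        # Assign every box to the anchor it overlaps most (highest IoU).
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            clusters[best].append(box)
        # Move each anchor to the mean width and height of its cluster.
        new_anchors = []
        for i, members in enumerate(clusters):
            if not members:
                new_anchors.append(anchors[i])  # keep the anchor of an empty cluster
                continue
            w = sum(m[0] for m in members) / len(members)
            h = sum(m[1] for m in members) / len(members)
            new_anchors.append((w, h))
        if new_anchors == anchors:  # assignments stopped changing
            break
        anchors = new_anchors
    # Return anchors ordered from smallest to largest area.
    return sorted(anchors, key=lambda a: a[0] * a[1])
```

Calling `fit_anchors(boxes, k=2)` on a list of `(width, height)` tuples returns two anchors sorted by area. Re-running such a fit on the training set is one way anchors could track the data, and on data dominated by small objects the smallest anchors would be the natural candidates for an extra small-scale prediction head.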