In the field of automatic target recognition and tracking, long-term tracking of aerial infrared targets has recently attracted great interest. Although deep trackers and correlation filtering trackers offer competitive performance, the problems of deformation, abrupt motion, heavy occlusion, and out-of-view targets remain unsolved. In addition, since this paper focuses on infrared images, it is also important to consider that infrared images have significant drawbacks, such as low resolution, low contrast, and lack of texture. In this paper, we combine correlation filtering trackers with a deep learning detection method to achieve accurate tracking results. Our tracking system is composed of three parts: the DTB correlation filtering tracker (DTB-CF), an improved regression model that discriminates the target from the background using adjustable Gaussian window functions; the UTA correlation filtering tracker (UTA-CF), an optimal regression model that updates the target appearance by jointly optimizing position and scale and integrating multi-feature fusion; and the YOLOv3 re-detector, which relocates the target to its correct position when tracking fails. In addition, we introduce the ratio between the average peak-to-correlation energy (APCE) of the current frame and the average APCE of former frames as a criterion for updating the UTA-CF tracker, which maintains the stability of the target model. We also combine the nearest neighbor maximum value method with the APCE criterion to initialize the YOLOv3 re-detector. We evaluate our algorithm on real aerial infrared target thermal image sequences in terms of precision plot, success plot, and speed. The experimental results show that our method significantly improves on state-of-the-art methods for long-term aerial infrared object tracking in both accuracy and robustness.
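To make the APCE ratio criterion mentioned above concrete, the following is a minimal sketch in Python. It uses the commonly cited definition APCE = |F_max - F_min|^2 / mean((F - F_min)^2) over a correlation response map F. The function names `apce` and `should_update` and the threshold value of 0.5 are assumptions for illustration only, not the settings used in this paper.

```python
import numpy as np


def apce(response):
    """Average peak-to-correlation energy of a 2-D correlation response map."""
    response = np.asarray(response, dtype=float)
    f_max = response.max()
    f_min = response.min()
    # Mean squared deviation of the whole map from its minimum value.
    energy = np.mean((response - f_min) ** 2)
    if energy == 0.0:
        # A perfectly flat map carries no peak information.
        return 0.0
    return (f_max - f_min) ** 2 / energy


def should_update(current_apce, history, ratio_threshold=0.5):
    """Hypothetical update rule: allow a model update only when the current
    APCE is not much lower than the mean APCE of former frames.

    ratio_threshold is an assumed value, not taken from the paper.
    """
    if len(history) == 0:
        return True  # nothing to compare against yet
    mean_history = float(np.mean(history))
    if mean_history <= 0.0:
        return True
    return current_apce / mean_history >= ratio_threshold


# A single sharp peak gives a high APCE; a second competing peak lowers it.
sharp = np.zeros((5, 5))
sharp[2, 2] = 1.0
ambiguous = sharp.copy()
ambiguous[0, 0] = 1.0

history = [apce(sharp), apce(sharp)]
print(apce(sharp), apce(ambiguous))
print(should_update(apce(ambiguous), history))
```

A sharp, single-peaked response yields a high APCE, while occlusion or drift tends to flatten the response or add competing peaks and so lowers it. Comparing against the running mean, rather than a fixed absolute value, lets the same rule be applied across sequences with different response scales.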
INDEX TERMS Aerial infrared object tracking, correlation filtering, deep learning detection, multi-feature fusion, APCE criterion.

The associate editor coordinating the review of this manuscript and approving it for publication was Shangce Gao.

We focus on the problem of long-term aerial infrared object tracking. Compared with visual tracking, infrared object tracking is more challenging. It must not only address the general tracking problems (e.g., deformation, abrupt motion, heavy occlusion, and out-of-view targets), but also cope with the inherent defects of infrared imagery: low resolution, low contrast, and lack of texture. An effective real-time tracking algorithm should be able to track the infrared object consistently over a long time without failing in these situations. At present, mainstream tracking algorithms fall into two types: traditional correlation filtering methods and convolutional neural network methods. Convolutional neural network methods have a powerful feature extraction capability.