Recently, Siamese‐based trackers have drawn considerable attention in the visual tracking field because of their excellent performance. However, visual object tracking on unmanned aerial vehicle (UAV) platforms remains difficult in the presence of small objects and interference from similar objects. Most existing aerial tracking methods adopt deeper networks or inefficient strategies to improve performance, and as a result can hardly meet real‐time requirements on mobile platforms with limited computing resources. Thus, in this work, an efficient and lightweight Siamese tracker (MobileTrack) is proposed for high‐speed UAV tracking, balancing performance and speed. Firstly, a lightweight convolutional network (D‐MobileNet) is designed to strengthen the feature representation of small objects. Secondly, an efficient object‐aware module is proposed for local cross‐channel information exchange, enhancing the feature information of the tracked object. In addition, an anchor‐free region proposal network is introduced to predict the object pixel by pixel. Finally, deep and shallow feature information is fully utilised by cascading multiple anchor‐free region proposal networks for accurate localisation and robust tracking. Extensive experiments on three UAV benchmarks show that the proposed tracker achieves outstanding performance while running faster than real time.
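To make the pixel‐by‐pixel prediction concrete, the following minimal sketch decodes a bounding box from per‐location outputs. It assumes a common anchor‐free formulation in which each feature‐map location predicts a classification score and its distances to the four box edges; the abstract does not confirm that MobileTrack uses exactly this parameterisation. The function name `decode_anchor_free`, the stride of 8, and the toy values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Distances = Tuple[float, float, float, float]  # (left, top, right, bottom)


def decode_anchor_free(
    scores: List[List[float]],
    distances: List[List[Distances]],
    stride: int = 8,  # assumed feature-map stride, not taken from the paper
) -> Tuple[Box, float]:
    """Pick the highest-scoring location and turn its edge distances into a box."""
    best_score = float("-inf")
    best_row = best_col = 0
    for row, score_row in enumerate(scores):
        for col, score in enumerate(score_row):
            if score > best_score:
                best_score, best_row, best_col = score, row, col

    # Map the feature-map cell back to the centre of its patch in the search image.
    cx = best_col * stride + stride // 2
    cy = best_row * stride + stride // 2

    left, top, right, bottom = distances[best_row][best_col]
    return (cx - left, cy - top, cx + right, cy + bottom), best_score


# Toy 3x3 score map with one clear peak at the centre cell.
scores = [[0.1, 0.2, 0.1],
          [0.2, 0.9, 0.3],
          [0.1, 0.3, 0.2]]
distances = [[(4.0, 4.0, 4.0, 4.0)] * 3 for _ in range(3)]
distances[1][1] = (10.0, 6.0, 12.0, 8.0)

box, score = decode_anchor_free(scores, distances, stride=8)
print(box, score)
```

Because no anchor boxes are enumerated, the decoding cost grows only with the number of feature‐map locations, which is one reason such heads suit platforms with limited computing resources.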