“…Video object tracking, which continuously estimates the state of an object across subsequent frames from its initial position and scale, is the basis for high-level visual tasks such as visual inspection, visual navigation, and visual servoing (Nousi et al., 2020; Wang et al., 2020; Karakostas et al., 2021; Sun et al., 2021). In engineering practice, changes in the object's posture and scale, noise, background occlusion, or varying lighting conditions may lead to tracking failure, so object tracking remains a challenging task (Zhang et al., 2020; Zhang H. et al., 2021; Liu et al., 2022).…”