“…Visual object tracking is an important topic in computer vision, where the target object is identified in the first frame and tracked in all frames of a video. Due to the significant learning ability, deep convolutional neural networks (DCNNs) have been widely used to object detection [34,35,62], image matting [42,43,64], super-resolution [63,67,68], image enhancement [61,65] and visual object tracking [2,[11][12][13]15,19,22,28,32,33,38,47,58,[70][71][72]. However, RGB-based trackers suffer from bad environmental conditions, e.g., low illumination, fast motion, and so on.…”