“…Based on the model architecture, feature extraction, and feature integration techniques, recent deep trackers can be classified into three categories: CNN-based trackers [29,88,89,90,31,91,32,92,93,34,33,94], CNN-Transformer based trackers [46,47,48,49,50,51,52,53,54,55,56,57], and fully-Transformer based trackers [58,59,60,61,62,63,64]. CNN-based trackers rely solely on a CNN architecture for feature extraction and target detection, whereas CNN-Transformer based trackers rely partially, and fully-Transformer based trackers rely fully, on a Transformer architecture.…”
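
The three-way split described in the excerpt can be read as a simple decision rule over which architecture families a tracker uses. The following is a minimal sketch of that rule, not code from the cited work: the names `TrackerCategory` and `categorize` and the two boolean flags are hypothetical and introduced only for illustration.

```python
from enum import Enum


class TrackerCategory(Enum):
    # Labels follow the wording of the excerpt.
    CNN_BASED = "CNN-based"
    CNN_TRANSFORMER = "CNN-Transformer based"
    FULLY_TRANSFORMER = "fully-Transformer based"


def categorize(uses_cnn: bool, uses_transformer: bool) -> TrackerCategory:
    """Map two hypothetical architecture flags to one of the three categories."""
    if uses_cnn and not uses_transformer:
        # Relies solely on a CNN for feature extraction and target detection.
        return TrackerCategory.CNN_BASED
    if uses_cnn and uses_transformer:
        # Relies partially on a Transformer alongside a CNN.
        return TrackerCategory.CNN_TRANSFORMER
    if uses_transformer:
        # Relies fully on a Transformer, with no CNN component.
        return TrackerCategory.FULLY_TRANSFORMER
    # Assumption of this sketch: every deep tracker uses at least one of the two.
    raise ValueError("expected a CNN, a Transformer, or both")
```

For example, `categorize(uses_cnn=True, uses_transformer=True)` returns `TrackerCategory.CNN_TRANSFORMER`. Reducing each tracker to two flags is a simplification; the excerpt also weighs feature extraction and feature integration techniques, which this sketch does not model.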