In the past few decades, target tracking algorithm has been paid great attention by peers at home and abroad in the field of computer vision because of its potential for in-depth research and practical value. Typical applications of target tracking algorithms include intelligent video surveillance, autonomous vehicles, human-computer interaction and so on. Given the initial state of a target object, the task of the target tracking algorithm is to estimate the state of the target in the subsequent video. Despite years of efforts, designing a target tracking algorithm is still a very challenging problem, because it poses changes, particularly illumination changes, and in addition, occlusion, complex environments, and the moving background will also cause changes in the appearance of the target. The traditional target tracking algorithm based on manually designed features or shallow classifiers uses manually designed low-level visual features or shallow classifiers to build the target apparent model, so the semantic information prediction ability of the target apparent model is limited. Given the defect that the above traditional target tracking algorithm is difficult to capture the semantic information of visual data in the target apparent model, inspired by the great success of deep convolution networks in image classification and speech recognition, a target tracking algorithm based on convolution neural network is proposed in this paper.