The problem of Multiple Object Tracking (MOT) consists of following the trajectories of different objects in a sequence, usually a video. In recent years, with the rise of Deep Learning, the algorithms that address this problem have benefited from the representational power of deep models. This paper provides a comprehensive survey of works that employ Deep Learning models to solve the task of MOT on single-camera videos. Four main steps in MOT algorithms are identified, and an in-depth review of how Deep Learning was employed in each of these steps is presented. A complete experimental comparison of the presented works on the three MOTChallenge datasets is also provided, identifying a number of similarities among the top-performing methods and presenting some possible future research directions.

In this section, a general description of the problem of MOT is provided. The main characteristics and common steps of MOT algorithms are identified and described in section 2.1. The metrics that are usually employed to evaluate the performance of the models are discussed in section 2.2, while the most important benchmark datasets are presented in section 2.3.

7 The website states that the detections were obtained using a model based on a latent SVM (L-SVM). That model is now known as the Deformable Parts Model (DPM).