“…For a fair comparison with the state-of-the-art MOT methods, we use the reference object detections provided by the benchmark. We train the set-to-set recognition method (Liu, Yan, and Ouyang 2017) based on the pre-trained GoogLeNet (Szegedy et al. 2015) on the training set of MOT2016 to ex-… In Table 3, NT is compared with the state-of-the-art methods including EAMTT (Sanchez-Matilla, Poiesi, and Cavallaro 2016), Quad (Son et al. 2017), MHT (Kim et al. 2015), STAM (Chu et al. 2017), NOMT (Choi 2015), AMIR (Sadeghian, Alahi, and Savarese 2017), NLPa (Levinkov et al. 2017), FWT (Henschel et al. 2017), LMP, INT (Lan et al. 2018), and DCCRF (Zhou et al. 2018). Our NT method performs on par with the state-of-the-art trackers (e.g., FWT and LMP) in terms of tracking accuracy.…”