Wide area motion imagery (WAMI) acquired by an airborne multicamera sensor enables continuous monitoring of large urban areas. Each image can cover regions of several square kilometers and contain thousands of vehicles. Reliable vehicle tracking in this imagery is an important prerequisite for surveillance tasks, but remains challenging due to low frame rate and small object size. Most WAMI tracking approaches rely on moving object detections generated by frame differencing or background subtraction. These detection methods fail when objects slow down or stop. Recent approaches for persistent tracking compensate for missing motion detections by combining a detection-based tracker with a second tracker based on appearance or local context. In order to avoid the additional complexity introduced by combining two trackers, we employ an alternative single tracker framework that is based on multiple hypothesis tracking and recovers missing motion detections with a classifierbased detector. We integrate an appearance-based similarity measure, merge handling, vehicle-collision tests, and clutter handling to adapt the approach to the specific context of WAMI tracking. We apply the tracking framework on a region of interest of the publicly available WPAFB 2009 dataset for quantitative evaluation; a comparison to other persistent WAMI trackers demonstrates state of the art performance of the proposed approach. Furthermore, we analyze in detail the impact of different object detection methods and detector settings on the quality of the output tracking results. For this purpose, we choose four different motion-based detection methods that vary in detection performance and computation time to generate the input detections. As detector parameters can be adjusted to achieve different precision and recall performance, we combine each detection method with different detector settings that yield (1) high precision and low recall, (2) high recall and low precision, and (3) best f-score. Comparing the tracking performance achieved with all generated sets of input detections allows us to quantify the sensitivity of the tracker to different types of detector errors and to derive recommendations for detector and parameter choice.