Abstract-We address the problem of learning robust and efficient multi-view object detectors for surveillance video indexing and retrieval. Our philosophy is that effective solutions to this problem can be obtained by learning detectors from very large amounts of training data. Following this direction, we propose a novel approach that strategically partitions the training set and learns a large array of complementary, compact, deep cascade detectors. At test time, given a video sequence captured by a fixed camera, a small number of detectors is automatically selected per image location. We demonstrate our approach on the problem of vehicle detection in challenging surveillance scenarios, using a large training dataset of around one million images. Our system runs at an average rate of 125 frames per second on a conventional laptop computer.
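
To make the per-location selection step concrete, the following is a minimal sketch and not the method used in this paper. The abstract does not state the selection criterion, so the sketch assumes one: during a calibration period on the fixed camera, every detector is run and its firings are counted per grid cell, and the `k` most frequently firing detectors are kept for each cell. The function name `select_detectors_per_cell`, the `(cell, detector_id)` input format, and the parameter `k` are all hypothetical.

```python
from collections import Counter, defaultdict


def select_detectors_per_cell(calibration_hits, k=2):
    """Pick up to k detectors per image cell (illustrative assumption only).

    calibration_hits: iterable of (cell, detector_id) pairs, one pair for each
    time a detector fired in a grid cell during a hypothetical calibration run.
    Returns a dict mapping each cell to a list of selected detector ids.
    """
    # Count how often each detector fired in each cell.
    counts = defaultdict(Counter)
    for cell, detector_id in calibration_hits:
        counts[cell][detector_id] += 1

    # Keep the k most frequent detectors per cell; ties are broken by
    # detector id so the result is deterministic.
    selected = {}
    for cell, cell_counts in counts.items():
        ranked = sorted(cell_counts.items(), key=lambda kv: (-kv[1], kv[0]))
        selected[cell] = [detector_id for detector_id, _ in ranked[:k]]
    return selected


if __name__ == "__main__":
    hits = [((0, 0), "A"), ((0, 0), "A"), ((0, 0), "B"), ((1, 0), "B")]
    print(select_detectors_per_cell(hits, k=1))
```

Under this assumption, only the selected detectors would be evaluated in each cell at test time, which is one way a large detector array could remain cheap to run on a fixed-camera stream.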