With recent advances in Earth observation technology, satellites can now capture city-scale videos, enabling potential applications in intelligent traffic management. Because of the broad field of view, moving vehicles in satellite videos typically occupy only tens of pixels, which makes it difficult to distinguish true objects from noise and other distractors. In addition, the edges of tall building rooftops are often mistakenly detected as moving vehicles because of motion parallax. This paper proposes a concise framework that effectively suppresses false targets while achieving high precision and recall. The study comprises three parts: 1) an adaptive filtering method is proposed to reduce noise, making the detection algorithm more reliable; 2) several background subtraction models are tested, and the best one is chosen to produce preliminary detection results with high recall but low precision; 3) a lightweight convolutional neural network (LCNN) is designed, trained on a small collection of samples, and then used to eliminate false targets. Experiments and evaluations demonstrate that our method substantially improves precision at the expense of a slight reduction in recall.

INDEX TERMS vehicle detection, object detection, convolutional neural network, background subtraction model, satellite video
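
The detect-then-verify structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The temporal-median background model stands in for whichever background subtraction model the paper selects. The `is_vehicle` callback is a hypothetical placeholder for the trained LCNN. The adaptive noise filter is omitted. All function names and the threshold parameter are assumptions made for illustration only.

```python
from collections import deque
from statistics import median


def median_background(frames):
    """Per-pixel temporal median over a list of 2-D grayscale frames.

    Assumption: a median model is used here only as a simple stand-in
    for the background subtraction model chosen in the paper.
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(width)]
            for y in range(height)]


def foreground_mask(frame, background, threshold):
    """Mark pixels whose absolute difference from the background exceeds threshold."""
    return [[abs(p - b) > threshold for p, b in zip(row, bg_row)]
            for row, bg_row in zip(frame, background)]


def connected_components(mask):
    """Group foreground pixels into 4-connected blobs, each a list of (y, x)."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            blob = []
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs


def detect(history, frame, threshold, is_vehicle):
    """Two-stage detection: high-recall candidates, then false-target removal.

    `is_vehicle(frame, blob)` is a hypothetical hook where a trained
    classifier (the LCNN in the paper) would accept or reject a candidate.
    """
    background = median_background(history)
    candidates = connected_components(foreground_mask(frame, background, threshold))
    return [blob for blob in candidates if is_vehicle(frame, blob)]
```

A call such as `detect(history, frame, 50, lambda f, b: len(b) >= 3)` uses a blob-size rule purely as a placeholder verifier. The first stage deliberately keeps every candidate so that recall stays high, and the second stage trades a little recall for precision by discarding candidates the verifier rejects.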