Existing video object cutout systems handle only a limited range of cases. They usually require detailed user interaction to segment real-life videos, which often suffer from both inseparable statistics (similar appearance between foreground and background) and temporal discontinuities (e.g., large movements, or newly exposed regions following disocclusion or topology change). In this paper, we present an efficient video cutout system to meet this challenge. A novel directional classifier is proposed to handle temporal discontinuities robustly, and multiple classifiers are then incorporated to cover a variety of cases. The outputs of these classifiers are integrated by another classifier, which is learnt from real examples. The foreground matte is then solved by a coherent matting procedure, and remaining errors can be removed easily by additive spatio-temporal local editing. Experiments demonstrate that our system performs more robustly and more intelligently than existing systems across various input types, thus saving considerable user labor and time.