Background subtraction is a key component of computer vision systems, widely used in video surveillance, object tracking, and anomaly detection, among other applications. The emergence of low-cost depth sensors such as the Microsoft Kinect and Asus Xtion PRO has provided a new data source for background subtraction. In this paper, we propose a background subtraction approach for depth videos based on convolutional neural networks (CNNs), called BGSNet-D (BackGround Subtraction neural Networks for Depth video). The method can be used in scenarios where color information is unavailable, such as under poor lighting, and can also be combined with existing RGB background subtraction methods. A preprocessing strategy is designed to reduce the influence of noise from depth sensors. Experimental results on the SBM-RGBD dataset show that the proposed method outperforms existing methods that use only depth data, and even matches the performance of methods that use RGB-D data.
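The abstract does not detail the preprocessing strategy. As an illustrative sketch only, the snippet below shows one common way to clean a raw depth frame before feeding it to a CNN: filling invalid (zero) depth readings by inpainting and median-filtering the result. The function name `preprocess_depth`, the inpainting step, and the filter parameters are assumptions for illustration, not the paper's actual BGSNet-D pipeline.

```python
# Illustrative sketch only: a common depth-frame preprocessing step,
# not the preprocessing strategy described in the paper.
import cv2
import numpy as np


def preprocess_depth(depth_frame: np.ndarray) -> np.ndarray:
    """Fill invalid depth pixels and smooth sensor noise.

    depth_frame: 2-D array of raw depth values (e.g., uint16 millimeters
    from a Kinect-class sensor). Returns an 8-bit frame suitable as CNN input.
    """
    # Normalize valid depth values to 8 bits so OpenCV inpainting and
    # filtering can be applied.
    depth = depth_frame.astype(np.float32)
    valid = depth > 0
    if valid.any():
        d_min, d_max = depth[valid].min(), depth[valid].max()
        depth = (depth - d_min) / max(d_max - d_min, 1e-6)
    depth8 = (np.clip(depth, 0.0, 1.0) * 255).astype(np.uint8)

    # Pixels with no depth reading (holes) are marked and filled in
    # from their neighborhood.
    hole_mask = (~valid).astype(np.uint8)
    filled = cv2.inpaint(depth8, hole_mask, inpaintRadius=3,
                         flags=cv2.INPAINT_TELEA)

    # Median filtering suppresses the salt-and-pepper noise typical
    # of consumer depth sensors.
    return cv2.medianBlur(filled, ksize=5)
```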