To improve imaging quality and reduce the reconstruction time of data-driven single-pixel imaging (SPI) under undersampling, a two-stage training method is proposed for high-quality SPI. A deep learning algorithm is used to simulate the single-pixel detection and imaging process. In the first training stage, an L2 regularization constraint is applied to the convolutional modulation patterns to obtain the optimal initial network weights. In the second stage, we adopt a coupled deep learning method for coded aperture design and SPI, which backpropagates the loss function to iteratively optimize the binarized modulation patterns and the imaging network parameters. This method reduces the binarization errors associated with the dithering algorithm, which improves the quality of data-driven SPI. Furthermore, compared with the traditional deep learning SPI method, this method has fewer parameters and accelerates image reconstruction. Simulations and experiments show that this method offers high imaging quality, short image reconstruction time, and simple training. At an image size of 64×64 pixels and a sampling rate of 10%, the proposed method achieves a peak signal-to-noise ratio of 23.22 dB, a structural similarity index of 0.76, and an image reconstruction time of about 2.57×10⁻⁴ seconds.
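
To make the reported operating point concrete, the following is a minimal sketch of the quantities involved: the number of measurements implied by a 10% sampling rate on a 64×64 image, a simulated single-pixel measurement with binary patterns, and the peak signal-to-noise ratio. The function names, the rounding rule for the measurement count, the random patterns, and the simple thresholding are all assumptions made for illustration; thresholding here is only a stand-in and is neither the dithering algorithm nor the learned binarization described above.

```python
import math
import random


def num_measurements(height, width, sampling_rate):
    """Number of single-pixel measurements at a given sampling rate.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return round(height * width * sampling_rate)


def binarize(patterns, threshold=0.0):
    """Threshold real-valued patterns to {0, 1} (illustrative stand-in only)."""
    return [[1 if v > threshold else 0 for v in row] for row in patterns]


def measure(patterns, image):
    """Simulate bucket-detector readings: one inner product per pattern."""
    return [sum(p * x for p, x in zip(row, image)) for row in patterns]


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for two flattened images."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# 64 x 64 image at a 10% sampling rate: 4096 * 0.10 = 409.6, rounded to 410.
n_pixels = 64 * 64
m = num_measurements(64, 64, 0.10)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
image = [rng.random() for _ in range(n_pixels)]  # hypothetical flattened scene
real_patterns = [[rng.gauss(0.0, 1.0) for _ in range(n_pixels)] for _ in range(m)]
binary_patterns = binarize(real_patterns)

y = measure(binary_patterns, image)  # the compressed measurement vector
print(m, len(y))
```

In this sketch the reconstruction network would receive only the `m` values in `y` and would have to recover all `n_pixels` intensities, which is why the choice of binary patterns matters for the achievable PSNR.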