Data compression is a useful method to reduce the communication energy consumption in wireless sensor networks (WSNs). Most existing neural network compression methods focus on improving the compression and reconstruction accuracy (i.e., increasing parameters and layers), ignoring the computation consumption of the network and its application ability in WSNs. In contrast, we pay attention to the computation consumption and application of neural networks, and propose an extremely simple and efficient neural network data compression model. The model combines the feature extraction advantages of Convolutional Neural Network (CNN) with the data generation ability of Variational Autoencoder (VAE) and Restricted Boltzmann Machine (RBM), we call it CBN-VAE. In particular, we propose a new efficient convolutional structure: Downsampling-Convolutional RBM (D-CRBM), and use it to replace the standard convolution to reduce parameters and computational consumption. Specifically, we use the VAE model composed of multiple D-CRBM layers to learn the hidden mathematical features of the sensing data, and use this feature to compress and reconstruct the sensing data. We test the performance of the model by using various real-world WSN datasets. Under the same network size, compared with the CNN, the parameters of CBN-VAE model are reduced by 73.88% and the floating-point operations (FLOPs) are reduced by 96.43% with negligible accuracy loss. Compared with the traditional neural networks, the proposed model is more suitable for application on nodes in WSNs. For the Intel Lab temperature data, the average Signal-to-Noise Ratio (SNR) value of the model can reach 32.51 dB, the average reconstruction error value is 0.0678 °C. The node communication energy consumption can be reduced by 95.83%. Compared with the traditional compression methods, the proposed model has better compression and reconstruction accuracy. At the same time, the experimental results show that the model has good fault detection performance and anti-noise ability. When reconstructing data, the model can effectively avoid fault and noise data.