To detect single faults from multi-fault coupled bearing data and to support further research on multi-fault bearing conditions, this paper proposes a method that combines feature self-extraction by a Sparse Auto-Encoder (SAE) with result fusion by an improved Dempster-Shafer (D-S) evidence theory. The SAE extracts compressed multi-fault signal features from the data of multiple vibration sensors. Data sets built from these compressed features are used to train a Support Vector Machine (SVM) according to the rule of single fault detection (R-SFD) proposed in this paper. The final fault detection results are obtained with the improved D-S evidence theory, which corrects the zero factor in the Basic Probability Assignment (BPA) and modifies the evidence weights using the Pearson Correlation Coefficient (PCC). Extensive evaluations on datasets from the experiment platform showed that the proposed method can realize single fault detection from multi-fault bearings. Fault detection accuracy increased with the output feature dimension of the SAE; when the feature dimension reached 200, the average detection accuracy of the three sensors for bearing inner, outer, and ball faults was 87.36%, 87.86% and 84.46%, respectively. After fusing the sensors' results with the improved D-S evidence theory (IDS), the detection accuracy for the three fault types reached 99.12%, 99.33% and 98.46%, which is 0.38%, 2.06% and 0.76% higher, respectively, than with the traditional D-S evidence theory. This indicates the effectiveness of improving the D-S evidence theory through PCC-based evidence weight calculation.

Entropy 2019, 21, 687

extracted features by Wavelet Packet Transformation, selected features through an auto-encoder, and classified them by SVM. The proposed approach achieved very high accuracy and robustness in classifying ball, inner race, and outer race faults even under very poor signal-to-noise ratio (SNR) conditions.
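The fusion step summarized in the abstract (zero correction of the BPA, PCC-based evidence weights, then D-S combination) can be illustrated with a minimal Python sketch. The exact formulas of the improved D-S method are not given in this part of the text, so everything specific here is an assumption made for illustration: the focal elements are restricted to the three singleton hypotheses (inner, outer, ball), zeros are replaced by a small hypothetical constant `eps` and the BPA is renormalized, each sensor's weight is its summed non-negative Pearson correlation with the other sensors, and the weighted-average evidence is combined with itself n-1 times using the classical Dempster rule. The function names and the example BPAs are hypothetical.

```python
import numpy as np

def correct_zeros(bpa, eps=1e-4):
    """Replace zero masses by a small eps and renormalize (assumed correction)."""
    bpa = np.asarray(bpa, dtype=float)
    bpa = np.where(bpa <= 0.0, eps, bpa)
    return bpa / bpa.sum()

def dempster_singletons(m1, m2):
    """Classical Dempster rule when all focal elements are singletons."""
    joint = m1 * m2                 # agreement mass per hypothesis
    total = joint.sum()             # equals 1 - K, with K the conflict
    if total == 0.0:
        raise ValueError("evidence is in total conflict")
    return joint / total

def pcc_weights(bpas):
    """Weight each evidence row by its non-negative Pearson correlation
    with the other rows (an assumed weighting scheme)."""
    corr = np.nan_to_num(np.corrcoef(bpas))   # constant rows give NaN
    np.fill_diagonal(corr, 0.0)               # ignore self-correlation
    support = np.clip(corr, 0.0, None).sum(axis=1)
    if support.sum() == 0.0:
        return np.full(len(bpas), 1.0 / len(bpas))
    return support / support.sum()

def fuse(bpas, eps=1e-4):
    """Zero-correct, weight by PCC, average, then combine n-1 times."""
    bpas = np.vstack([correct_zeros(b, eps) for b in bpas])
    weights = pcc_weights(bpas)
    averaged = weights @ bpas
    fused = averaged
    for _ in range(len(bpas) - 1):
        fused = dempster_singletons(fused, averaged)
    return fused

# Hypothetical BPAs from three sensors over (inner, outer, ball).
sensor_bpas = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.0, 0.9, 0.1],   # conflicting sensor with a zero mass
]
fused = fuse(sensor_bpas)
print(fused)
```

In this sketch the third sensor correlates negatively with the other two, so its weight drops to zero and it no longer vetoes the "inner" hypothesis through its zero mass, which is the kind of behavior that zero correction and evidence weighting are meant to provide.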
Because the feature extraction approaches of traditional shallow learning are unavoidably limited in accuracy and robustness, Lu [6] investigated a deep learning method based on a convolutional neural network (CNN), a novel feature representation method for bearing data that uses supervised deep learning to identify more robust and salient feature representations and reduce information loss; two experiments demonstrated the efficiency of the proposed method. Wei [7] proposed a new adaptive feature selection technique for bearing fault diagnosis using affinity propagation clustering, and the results demonstrated that the approach can reliably and accurately identify different fault categories and severities of bearings. The studies above and others [8][9][10][11] all address the failure of a single point on the bearing; current methods of automatic feature extraction and automatic classification do not consider conditions in which multiple faults are combined. Actually, current researche...