Bearings always suffer from surface defects, such as scratches, black spots, and pits. Those surface defects have great effects on the quality and service life of bearings. Therefore, the defect detection of the bearing has always been the focus of the bearing quality control. Deep learning has been successfully applied to the objection detection due to its excellent performance. However, it is difficult to realize automatic detection of bearing surface defects based on data-driven-based deep learning due to few samples data of bearing defects on the actual production line. Sample preprocessing algorithm based on normalized sample symmetry of bearing is adopted to greatly increase the number of samples. Two different convolutional neural networks, supervised networks and unsupervised networks, are tested separately for the bearing defect detection. The first experiment adopts the supervised networks, and ResNet neural networks are selected as the supervised networks in this experiment. The experiment result shows that the AUC of the model is 0.8567, which is low for the actual use. Also, the positive and negative samples should be labelled manually. To improve the AUC of the model and the flexibility of the samples labelling, a new unsupervised neural network based on autoencoder networks is proposed. Gradients of the unlabeled data are used as labels, and autoencoder networks are created with U-net to predict the output. In the second experiment, positive samples of the supervised experiment are used as the training set. The experiment of the unsupervised neural networks shows that the AUC of the model is 0.9721. In this experiment, the AUC is higher than the first experiment, but the positive samples must be selected. To overcome this shortage, the dataset of the third experiment is the same as the supervised experiment, where all the positive and negative samples are mixed together, which means that there is no need to label the samples. This experiment shows that the AUC of the model is 0.9623. Although the AUC is slightly lower than that of the second experiment, the AUC is high enough for actual use. The experiment results demonstrate the feasibility and superiority of the proposed unsupervised networks.