Artificial intelligence (AI) can potentially improve the reliability of transformer protection by fusing multiple features. However, because inrush-current and internal-fault data are scarce, existing methods generalize poorly. This paper proposes a denoising-classification neural network (DCNN) that integrates a convolutional auto-encoder (CAE) and a convolutional neural network (CNN), and uses it to develop a reliable transformer protection scheme that identifies the exciting voltage-differential current curve (VICur). In the DCNN, the CAE shares its encoder with the CNN, which consists of that encoder followed by a classifier. Through the interaction between the CAE reconstruction process and the CNN classification process, the CAE treats the saturated features of the VICur as noise and removes them accurately, which in turn guides the CNN to focus on the unsaturated features of the VICur. The unsaturated part of the VICur approximates an ellipse, a property that clearly distinguishes a healthy transformer from a faulty one. The unsaturated features extracted by the CNN therefore reduce the data ergodicity requirement of the AI model and improve its generalizability. Finally, the CNN trained within the DCNN is used to develop a protection scheme. PSCAD simulations and dynamic model experiments verify its superior performance.
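The statement that the unsaturated part of the VICur approximates an ellipse can be illustrated with a minimal sketch. It assumes, as a simplification not stated in the abstract, that the unsaturated excitation branch behaves like a linear series R-L element carrying a sinusoidal current. The function names `vicur_linear` and `ellipse_residual` and all parameter values are hypothetical and chosen only for illustration. Under that assumption, u = R·i + L·di/dt, so the sampled (i, u) points satisfy ((u − R·i)/(ωLI))² + (i/I)² = 1, which is the equation of an ellipse in the current-voltage plane.

```python
import math


def vicur_linear(R, L, I_peak, freq=50.0, n=200):
    """Sample one cycle of (i, u) for an ASSUMED linear series R-L branch.

    R, L, I_peak and freq are hypothetical illustration values,
    not parameters taken from the paper.
    """
    w = 2.0 * math.pi * freq
    points = []
    for k in range(n):
        t = k / (n * freq)                    # n samples over one period
        i = I_peak * math.sin(w * t)          # sinusoidal current
        di_dt = I_peak * w * math.cos(w * t)  # its analytic time derivative
        u = R * i + L * di_dt                 # voltage across the R-L branch
        points.append((i, u))
    return points


def ellipse_residual(points, R, L, I_peak, freq=50.0):
    """Largest deviation of the samples from the implicit ellipse
    ((u - R*i) / (w*L*I_peak))**2 + (i / I_peak)**2 = 1.
    """
    w = 2.0 * math.pi * freq
    return max(
        abs(((u - R * i) / (w * L * I_peak)) ** 2 + (i / I_peak) ** 2 - 1.0)
        for i, u in points
    )


if __name__ == "__main__":
    pts = vicur_linear(R=5.0, L=2.0, I_peak=0.5)
    print("max deviation from ellipse:", ellipse_residual(pts, 5.0, 2.0, 0.5))
```

The sketch covers only the idealized linear (unsaturated) case. It does not model core saturation, inrush, or internal faults, and it says nothing about how the DCNN itself extracts these features.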