Fault diagnosis of rotor systems is important for preventing unexpected failures. Recently, deep learning (DL) methods, such as convolutional neural networks (CNNs), have been used in many research areas, including fault diagnosis. DL has gained significant attention because it can learn suitable features directly from input data. Deeper DL architectures can learn richer hierarchical features; therefore, many studies have stacked more layers of neural networks, the basic building blocks of DL. However, deeper architectures are harder to train fully because gradient information flows poorly through many layers during training. In this paper, a direct-connection-based CNN (DC-CNN) method is proposed to significantly improve training efficiency and diagnosis performance. DC-CNN connects feature maps of different layers within a CNN to improve the flow of gradient information across the layers. These additional connections, however, can increase the number of trainable parameters in the network. To prevent problems that this increase might cause, dimension reduction modules are also developed. Moreover, to account for the anisotropic characteristics inherent in rotor systems, vibration images containing both spatial and temporal information are generated and used. The effectiveness of DC-CNN is validated using experimental data from a rotor testbed. The experimental results indicate that the proposed method outperforms conventional approaches while using fewer parameters. Visualizations of the learned features also indicate that the proposed method learns more effective and significant features. Furthermore, the proposed method outperforms other approaches when data are insufficient or noisy.
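
The trade-off between direct connections and parameter count can be illustrated with a small calculation. The sketch below is not the DC-CNN architecture itself: it assumes a DenseNet-style layout in which each layer receives the concatenation of all earlier feature maps, and it assumes that a dimension reduction module is a 1x1 convolution placed before each 3x3 convolution. The function names and the channel sizes (`in_ch`, `growth`, `bottleneck`) are hypothetical and chosen only for illustration.

```python
def conv_params(in_ch, out_ch, k):
    """Weights plus biases of a k x k 2-D convolution."""
    return in_ch * out_ch * k * k + out_ch


def stack_params(in_ch, growth, n_layers, k=3, bottleneck=None):
    """Parameter count of a directly connected stack (assumed layout).

    Each layer sees the input concatenated with every earlier layer's
    output and adds `growth` new feature maps. If `bottleneck` is given,
    a 1x1 convolution first reduces the concatenated maps to that many
    channels (the assumed form of a dimension reduction module).
    """
    total = 0
    ch = in_ch  # channels visible to the next layer
    for _ in range(n_layers):
        if bottleneck is not None:
            total += conv_params(ch, bottleneck, 1)       # 1x1 reduction
            total += conv_params(bottleneck, growth, k)   # k x k conv
        else:
            total += conv_params(ch, growth, k)
        ch += growth  # direct connection: new maps are appended
    return total


# Illustrative sizes only; they are not taken from the paper.
plain = stack_params(in_ch=16, growth=32, n_layers=4)
reduced = stack_params(in_ch=16, growth=32, n_layers=4, bottleneck=8)
print(plain, reduced)
```

Under these assumed sizes, the stack with 1x1 reductions needs far fewer parameters than the stack without them, because the cost of each 3x3 convolution no longer grows with the number of concatenated feature maps.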