Bearing fault diagnosis based on vibration signals generally achieves good results. However, it is often impractical in industrial applications because vibration sensors are costly and their installation is restricted. The motor current signal (MCS), which is easy to obtain, has therefore received widespread attention in recent years. Its low signal-to-noise ratio (SNR), however, prevents traditional fault diagnosis methods from meeting diagnostic accuracy requirements. To enable bearing fault diagnosis from the MCS, this paper proposes ISCV-ViT, a rolling bearing fault diagnosis method based on the MCS and the Vision Transformer (ViT) model. Specifically, a signal processing method based on the instantaneous square current value (ISCV) is proposed to convert the MCS, obtained directly from a frequency converter, into time-domain images. The ViT model is then applied to these images for bearing fault diagnosis. Finally, the method is verified experimentally on the public bearing dataset of Paderborn University (PU) and the bearing dataset of Shenzhen Technology University (SZTU). The experimental results show that ISCV-ViT reaches average accuracies of 96.60% and 94.87% on the PU and SZTU datasets, respectively.
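As a rough illustration of the signal-to-image step, the following minimal Python sketch may help. It rests on assumptions that are not stated in this abstract: the ISCV is taken to be the per-sample sum of squared phase currents, and the time-domain image is taken to be a min-max normalized grayscale reshaping of a fixed-length segment. The function names `iscv` and `to_image` and the 64 x 64 image size are hypothetical, and the preprocessing actually used in ISCV-ViT may differ.

```python
import numpy as np


def iscv(phase_currents):
    """Per-sample sum of squared phase currents.

    ASSUMPTION: this definition of the instantaneous square current
    value is illustrative only; the abstract does not give the formula.
    Accepts a 1-D array (one phase) or a 2-D array (phases x samples).
    """
    x = np.asarray(phase_currents, dtype=float)
    if x.ndim == 1:
        x = x[np.newaxis, :]
    return np.sum(x ** 2, axis=0)


def to_image(signal, size=64):
    """Reshape the first size*size samples into a size x size uint8 image.

    ASSUMPTION: min-max scaling to [0, 255] and row-wise reshaping are
    hypothetical choices, not the authors' documented procedure.
    """
    signal = np.asarray(signal, dtype=float)
    n = size * size
    if signal.size < n:
        raise ValueError(f"need at least {n} samples, got {signal.size}")
    seg = signal[:n]
    span = seg.max() - seg.min()
    if span == 0:
        scaled = np.zeros_like(seg)  # constant segment maps to black
    else:
        scaled = (seg - seg.min()) / span
    return np.round(scaled * 255).astype(np.uint8).reshape(size, size)


if __name__ == "__main__":
    # Synthetic two-phase 50 Hz currents sampled at 64 kHz (made-up values).
    t = np.arange(64 * 64) / 64000.0
    i_a = np.sin(2 * np.pi * 50 * t)
    i_b = np.sin(2 * np.pi * 50 * t - 2 * np.pi / 3)
    image = to_image(iscv([i_a, i_b]), size=64)
    print(image.shape, image.dtype)
```

Under this assumed definition, a perfectly balanced three-phase current yields a constant ISCV, so deviations from a constant value carry the disturbance information. The resulting image would then be resized and passed to a ViT classifier, which is outside the scope of this sketch.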