Plant disease identification is a crucial problem in agriculture, and advances in deep learning have made early and accurate identification increasingly practical. In recent years, vision transformers have attracted significant attention from researchers across vision-based application areas. We designed a model with an encoder–decoder architecture that uses transfer learning to classify plant diseases efficiently and recognizes a large number of diseases across multiple crops. The model was evaluated on the “PlantVillage”, “FGVC8”, and “EMBRAPA” datasets, which contain leaf images from crops such as apples, soybeans, tomatoes, and potatoes. These datasets cover diseases caused by fungi, including rust, spot, and scab, as well as viral diseases such as leaf curl. The model achieved 99.9% accuracy on the “PlantVillage” dataset, 97.4% on the “EMBRAPA” dataset, and 91.5% on the “FGVC8” dataset, making it competitive with other state-of-the-art models. This study provides a robust and reliable solution for plant disease classification and contributes to the advancement of precision agriculture.