Crop disease detection and management is critical to improving productivity, reducing costs, and promoting environmentally friendly crop treatment methods. Modern technologies, such as data mining and machine learning algorithms, have been used to develop automated crop disease detection systems. However, centralized approach to data collection and model training induces challenges in terms of data privacy, availability, and transfer costs. To address these challenges, federated learning appears to be a promising solution. In this paper, we explored the application of federated learning for crop disease classification using image analysis. We developed and studied convolutional neural network (CNN) models and those based on attention mechanisms, in this case vision transformers (ViT), using federated learning, leveraging an open access image dataset from the "PlantVillage" platform. Experiments conducted concluded that the performance of models trained by federated learning is influenced by the number of learners involved, the number of communication rounds, the number of local iterations and the quality of the data. With the objective of highlighting the potential of federated learning in crop disease classification, among the CNN models tested, ResNet50 performed better in several experiments than the other models, and proved to be an optimal choice, but also the most suitable for a federated learning scenario. The ViT_B16 and ViT_B32 Vision Transformers require more computational time, making them less suitable in a federated learning scenario, where computational time and communication costs are key parameters. The paper provides a state-of-the-art analysis, presents our methodology and experimental results, and concludes with ideas and future directions for our research on using federated learning in the context of crop disease classification.