Tomato cultivation is expanding rapidly, but the sector faces significant environmental (abiotic) and biological (biotic, or disease) stresses that adversely affect the crop’s growth, reproduction, and overall yield potential. The objective of this work is to build a lightweight, deep learning based convolutional neural network (CNN) architecture for the real-time classification of biotic stress in tomato plant leaves. The model aims to overcome the drawbacks of conventional CNNs, which are resource-intensive and time-consuming, by using optimization methods that reduce processing complexity and improve classification accuracy. Traditional plant disease classification methods predominantly use CNN based deep learning techniques originally developed for general image classification tasks. These CNNs are computationally intensive, and their long training times hinder real-time application. To address this, a lighter CNN framework with two key components is proposed. First, an Elephant Herding Optimization (EHO) algorithm selects the features most pertinent to the classification task. Second, the classification module integrates a Hessian-based Optimal Brain Surgeon (HOBS) approach with a pruned Extreme Learning Machine (ELM), optimizing network parameters while reducing computational complexity. On the PlantVillage dataset, comprising 8,000 leaf images across 10 distinct tomato classes, the proposed pruned model achieves an accuracy of 95.73%, a Cohen’s kappa of 0.81, and a training time of 2.35 s. By removing irrelevant connections in the classification layer, the framework reduces the number of parameters and brings the model size down to 9.2 MB.
The performance of the proposed classifier was compared with that of existing deep learning models. The experimental results show that the pruned DenseNet achieves an accuracy of 86.64% with a model size of 10.6 MB, GhostNet reaches 92.15% at 10.9 MB, and CACPNET reaches 92.4% at 18.0 MB. In contrast, the proposed approach outperforms these models in accuracy (95.73%) and model size (9.2 MB), as well as in processing time.
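The pruning idea in the classification module can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an ELM whose output weights are fitted by ridge regression, applies the standard Optimal Brain Surgeon saliency and weight update to those output weights only, and runs on synthetic three-class data rather than on leaf-image features. The EHO feature selection step is omitted, and the names `train_elm` and `obs_prune` and all hyperparameters are hypothetical.

```python
import numpy as np


def train_elm(X, T, n_hidden=20, ridge=1e-2, seed=0):
    """Fit an ELM: fixed random hidden layer, ridge-regression output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                       # hidden activations, shape (n, n_hidden)
    A = H.T @ H + ridge * np.eye(n_hidden)       # Hessian of the ridge loss for one output column
    beta = np.linalg.solve(A, H.T @ T)           # output weights, shape (n_hidden, n_classes)
    return W, b, beta, A


def obs_prune(beta, A, n_prune):
    """Remove n_prune output weights per class column using OBS."""
    beta = beta.copy()
    mask = np.ones(beta.shape, dtype=bool)       # True = connection is kept
    for k in range(beta.shape[1]):
        A_inv = np.linalg.inv(A)                 # inverse Hessian for this column
        for _ in range(n_prune):
            alive = mask[:, k]
            # OBS saliency: loss increase caused by deleting each remaining weight.
            saliency = np.full(beta.shape[0], np.inf)
            saliency[alive] = beta[alive, k] ** 2 / (2.0 * np.diag(A_inv)[alive])
            q = int(np.argmin(saliency))
            # Adjust the surviving weights to compensate for removing weight q.
            beta[:, k] -= beta[q, k] / A_inv[q, q] * A_inv[:, q]
            # Downdate the inverse Hessian so it describes the remaining weights only.
            A_inv = A_inv - np.outer(A_inv[:, q], A_inv[q, :]) / A_inv[q, q]
            mask[q, k] = False
            beta[~mask[:, k], k] = 0.0           # keep pruned weights exactly zero
    return beta, mask


# Synthetic stand-in data (assumption): three Gaussian blobs in four dimensions.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0, 0.0, 0.0],
                    [3.0, 3.0, 0.0, 0.0],
                    [0.0, 3.0, 3.0, 0.0]])
y = rng.integers(0, 3, size=300)
X = centers[y] + rng.normal(size=(300, 4))
T = np.eye(3)[y]                                 # one-hot targets

W, b, beta, A = train_elm(X, T)
beta_pruned, mask = obs_prune(beta, A, n_prune=10)

H = np.tanh(X @ W + b)
acc_dense = np.mean(np.argmax(H @ beta, axis=1) == y)
acc_pruned = np.mean(np.argmax(H @ beta_pruned, axis=1) == y)
print(f"dense accuracy:  {acc_dense:.3f} ({beta.size} output weights)")
print(f"pruned accuracy: {acc_pruned:.3f} ({int(mask.sum())} output weights)")
```

Because the ridge loss is quadratic in the output weights, each OBS step in this sketch leaves the surviving weights at the exact optimum for the reduced connection set, so no retraining pass is needed after pruning. Whether the paper's HOBS variant works at this level of the network is not stated in the abstract.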