Because fruits vary widely in appearance, automating their identification remains a challenge. Manual fruit categorisation is also difficult, since fruit types and subtypes are often location-dependent. A number of recent publications have classified the Fruit-360 dataset using methods based on Convolutional Neural Networks (e.g., VGG16, Inception V3, MobileNet, and ResNet18). Unfortunately, none of these studies is extensive enough to cover all 131 fruit classes, and the models they use do not achieve optimal computational efficiency. Here we propose a new, robust, and comprehensive approach that classifies the whole Fruit-360 dataset, which consists of 90,483 sample images and 131 fruit classes. We address this research gap with an algorithm based on a modified AlexNet combined with an efficient classifier. The input images are processed by the modified AlexNet, which uses the Golden Jackal Optimisation Algorithm (GJOA) to select the best tuning of the feature extraction stage. The classifier is the Fruit Shift Self Attention Transform Mechanism (FSSATM), which aims to improve the transformer's accuracy and comprises a spatial feature extraction (SFE) module and a spatial position encoding (SPE). The algorithm was validated over multiple iterations and with a confusion matrix. The results show that the proposed approach yields a relative accuracy of 98%. Furthermore, state-of-the-art methods for this task were identified in the literature and compared with the proposed system. The comparison indicates that the proposed algorithm can process the whole Fruit-360 dataset efficiently.
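
As an illustration of the confusion-matrix validation mentioned above, the following minimal Python sketch shows how an accuracy figure can be derived from a confusion matrix. It assumes that accuracy means the share of correctly classified samples (the text does not define "relative accuracy"), and the function names and the 3-class toy matrix are hypothetical, with made-up counts that are not results from this work.

```python
def accuracy_from_confusion(matrix):
    """Overall accuracy: correctly classified samples (the diagonal) over all samples."""
    if any(len(row) != len(matrix) for row in matrix):
        raise ValueError("confusion matrix must be square")
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_recall(matrix):
    """Recall per true class: diagonal entry divided by its row sum (0.0 for empty rows)."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Hypothetical 3-class example (rows = true class, columns = predicted class).
# The counts are invented for illustration only.
toy = [
    [48, 1, 1],
    [2, 47, 1],
    [0, 1, 49],
]

print(accuracy_from_confusion(toy))
print(per_class_recall(toy))
```

For the full dataset the same computation would apply to a 131 × 131 matrix, where per-class recall helps reveal which fruit classes are confused with one another even when overall accuracy is high.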