Detecting stress at an early stage is crucial for stabilizing crop yields and agricultural production. The aim of this study was to construct a nondestructive and robust method for predicting the early physiological drought status of the tomato (Solanum lycopersicum). For this purpose, a convolutional neural network (CNN)-based model with a one-dimensional (1D) kernel was proposed for fitting visible and near-infrared (Vis/NIR) spectral data. To prevent degradation and enhance the feature comprehension of the deep neural network architecture, residual and global context modules were embedded in the proposed 1D-CNN model, yielding the 1D spectrogram power net (1D-SP-Net). In model testing, the 1D-SP-Net outperformed the 1D-CNN, partial least squares discriminant analysis (PLSDA), and random forest (RF) models, achieving an accuracy of 96.3%, a precision of 98.0%, a Matthews correlation coefficient of 0.92, and an F1 score of 0.95. Furthermore, when evaluated on various synthesized imbalanced data sets, the proposed 1D-SP-Net remained robust and consistent, outperforming the other models in predictive performance. These results indicate that the 1D-SP-Net is a promising model that is resistant to the effects of imbalanced data sets and able to determine the early drought stress status of tomato seedlings in a non-invasive manner.
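
For readers less familiar with the four reported metrics, the following minimal sketch shows how accuracy, precision, the Matthews correlation coefficient (MCC), and the F1 score can be derived from a confusion matrix. It assumes a binary formulation (drought-stressed versus control), which is an assumption made here for illustration. The function name `binary_metrics` and all counts are hypothetical and are not taken from the study's data.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, F1, and MCC from confusion-matrix counts.

    Hypothetical helper for illustration only; "positive" is assumed to mean
    the drought-stressed class. Undefined ratios (zero denominator) are
    reported as 0.0, a common convention.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC uses all four cells, so it stays informative when classes are imbalanced.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


# Hypothetical balanced test set: 50 stressed and 50 control samples.
balanced = binary_metrics(tp=45, fp=5, fn=5, tn=45)

# Hypothetical imbalanced test set (95 stressed, 5 control) scored by a
# degenerate classifier that labels every sample as stressed.
degenerate = binary_metrics(tp=95, fp=5, fn=0, tn=0)

print(balanced)
print(degenerate)
```

In the second hypothetical case, accuracy and F1 remain high even though the classifier has learned nothing about the control class, whereas the MCC falls to zero. This illustrates why the MCC is a useful companion metric when models are compared on imbalanced data sets.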