Approaches based on near-infrared hyperspectral imaging (NIR-HSI) combined with machine learning have been developed to classify the leaves of hybrid cherry tomatoes and thereby identify the species of hybrid cherry tomato plants. Near-infrared (NIR) hyperspectral images of 400 cherry tomato leaves (100 per species) were collected in the wavelength range of 900–1700 nm. Three machine learning algorithms, linear discriminant analysis (LDA), random forest (RF), and support vector machine (SVM), were used to construct leaf classification models from hyperspectral data preprocessed with Savitzky-Golay (SG) smoothing, first derivative (first Der), and standard normal variate (SNV). Principal component analysis (PCA) was also used to reduce the data dimension and extract spectral features. The LDA model reaches the highest classification accuracy among the three algorithms, and SNV improves model accuracy more than SG smoothing or first Der. Analysis based on PCA spectral feature extraction demonstrates that the internal material content of the leaves differs between cherry tomato plants of different species, which enables the models to distinguish between the species. The different effects of the mesophyll and vein regions (VR) on the accuracy of the leaf classification model were also investigated. Classification accuracy improves by 0.033 or 0.042 when the mesophyll replaces the vein or the whole leaf as the region of interest (ROI) from which reflectance spectra are extracted for modeling. As a result, the LDA classification model combined with SNV preprocessing reached accuracies of 0.998 and 0.973 on the training and test sets, respectively.
The results suggest that using the mesophyll region (MR) as the ROI can improve the performance of the leaf classification model, which provides a new strategy for efficient and non-destructive classification of different hybrid cherry tomato plants.
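
To make the SNV preprocessing step concrete, the following is a minimal sketch and not the authors' implementation. It assumes the common definition of SNV, in which each spectrum is centred on its own mean and scaled by its own sample standard deviation. The function name `snv` and the reflectance values are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Apply standard normal variate scaling to a single spectrum.

    Assumption: the sample standard deviation (n - 1 denominator) is used;
    the source does not state which convention was applied.
    """
    m = mean(spectrum)
    s = stdev(spectrum)
    if s == 0:
        raise ValueError("SNV is undefined for a flat spectrum")
    return [(value - m) / s for value in spectrum]


# Hypothetical reflectance values for two leaves (not data from the study).
# leaf_b has the same spectral shape as leaf_a at twice the intensity,
# mimicking a multiplicative scatter effect.
leaf_a = [0.42, 0.45, 0.51, 0.48, 0.40]
leaf_b = [0.84, 0.90, 1.02, 0.96, 0.80]

snv_a = snv(leaf_a)
snv_b = snv(leaf_b)
```

Under these assumptions the two transformed spectra coincide up to floating-point error, which illustrates why SNV can help a classifier focus on spectral shape rather than overall intensity.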