Background
The combination of computer vision devices, such as multispectral cameras, with artificial intelligence has provided a major leap forward in image-based analysis of biological processes. Supervised artificial intelligence algorithms require large ground truth image datasets for model training; such datasets also make it possible to validate or refute research hypotheses and to compare models. However, public image datasets are scarce, and ground truth images are few relative to the numbers required for training algorithms.
Results
We created a dataset of 1,283 multidimensional arrays, using berries from five different grape varieties. Each array comprises 37 images, acquired at wavelengths between 488.38 and 952.76 nm, of a single berry. Each multispectral image is paired with measurements taken on the same individual grape, including weight, anthocyanin content, and Brix index. The images therefore have paired measures, creating a ground truth dataset. We tested the dataset with two neural network algorithms: a multilayer perceptron (MLP) and a 3-dimensional convolutional neural network (3D-CNN). A perfect classification model (100% accuracy) was fit with either the MLP or the 3D-CNN algorithm.
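The pairing of each multispectral array with its measurements can be illustrated with a minimal sketch. Everything beyond the figures quoted above is an assumption made for illustration: the `BerrySample` container, its field names, the 8 x 8 pixel size, the evenly spaced band centres, and the synthetic values are hypothetical and do not describe the dataset's actual file layout. The per-band mean is one plausible way to obtain a flat feature vector of the kind an MLP takes as input, whereas a 3D-CNN would consume the full array; neither is claimed to be the preprocessing used by the authors.

```python
from dataclasses import dataclass

import numpy as np

N_BANDS = 37  # images per array, as stated in the abstract

# Assumption: band centres are evenly spaced between the two stated limits.
WAVELENGTHS_NM = np.linspace(488.38, 952.76, N_BANDS)


@dataclass
class BerrySample:
    """Hypothetical container pairing one array with its ground truth."""
    cube: np.ndarray     # shape (N_BANDS, height, width)
    variety: str         # class label
    weight: float        # paired measurements; units not specified here
    anthocyanins: float
    brix: float


def mean_spectrum(sample: BerrySample) -> np.ndarray:
    """Average every band over all pixels: one value per wavelength."""
    return sample.cube.reshape(N_BANDS, -1).mean(axis=1)


# Synthetic stand-in for a single berry; these are not real measurements.
rng = np.random.default_rng(0)
sample = BerrySample(
    cube=rng.random((N_BANDS, 8, 8)),
    variety="variety_A",
    weight=1.5,
    anthocyanins=0.8,
    brix=18.0,
)

features = mean_spectrum(sample)  # flat vector of length N_BANDS
```

In this sketch, `features` and `sample.variety` would form one training pair for classification, while `sample.brix` or `sample.anthocyanins` could serve as regression targets.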
Conclusions
This is the first public dataset of grape ground truth multispectral images. Each multispectral image is associated with measurements of weight, anthocyanin content, and Brix index. The dataset should be useful for developing deep learning algorithms for classification, dimensionality reduction, regression, and prediction analysis.