The color of a bunch of grapes is an important factor in determining the appropriate time for harvesting. However, judging whether the color of a bunch is appropriate for harvesting requires experience, and the result can vary between individuals. In this paper, we describe a system that supports grape harvesting based on color estimation using deep learning. Estimating the color of a bunch of grapes requires bunch detection, grain detection, removal of diseased grains, and color estimation, and we adopt a deep learning-based approach for each of these steps. YOLOv5, an object detection model that balances accuracy and processing speed, is used for bunch detection and grain detection. An autoencoder-based anomaly detection model is used to detect diseased grains. Because perceived color is strongly affected by brightness, a color estimation model that is less sensitive to brightness is required. Accordingly, we propose a color estimation model based on AlexNet that is trained by multi-task learning with metric learning. In a practical experiment using actual grapes, we empirically selected the best three image channels from the RGB and CIELAB (L*a*b*) color spaces. The proposed multi-task model, using the combination of the “L” channel from the L*a*b* color space and the “GB” channels from the RGB color space (represented as the “LGB” color space), achieved a color estimation accuracy of 72.1%, compared to 21.1% for the model that used the normal RGB image. In addition, the proposed system was able to determine the suitability of grapes for harvesting with an accuracy of 81.6%, demonstrating its effectiveness.
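As an illustration of the “LGB” representation, the following minimal sketch replaces the R channel of an image with the CIELAB lightness L* while keeping G and B unchanged. It is not the implementation used in this study. It assumes 8-bit sRGB input, a D65 white point, and a linear rescaling of L* from [0, 100] to [0, 255] so that all three channels share one range. The function names (`srgb_to_linear`, `lightness`, `rgb_to_lgb`, `image_to_lgb`) are hypothetical.

```python
def srgb_to_linear(c8):
    """Convert one 8-bit sRGB channel value to linear light in [0, 1]."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def lightness(r, g, b):
    """CIELAB L* in [0, 100] for an 8-bit sRGB pixel (assumed D65 white point)."""
    # Relative luminance Y from linear RGB; the reference white has Y = 1.
    y = (0.2126 * srgb_to_linear(r)
         + 0.7152 * srgb_to_linear(g)
         + 0.0722 * srgb_to_linear(b))
    delta = 6 / 29
    f = y ** (1 / 3) if y > delta ** 3 else y / (3 * delta ** 2) + 4 / 29
    return 116 * f - 16


def rgb_to_lgb(pixel):
    """Replace R with L* (rescaled to 0..255, an assumption); keep G and B."""
    r, g, b = pixel
    l8 = round(lightness(r, g, b) * 255 / 100)
    return (l8, g, b)


def image_to_lgb(image):
    """Apply rgb_to_lgb to an image stored as rows of (R, G, B) tuples."""
    return [[rgb_to_lgb(p) for p in row] for row in image]


if __name__ == "__main__":
    sample = [[(120, 40, 90), (200, 180, 60)]]  # made-up pixel values
    print(image_to_lgb(sample))
```

In this sketch the brightness information is carried explicitly by the first channel, while the remaining two channels retain the original green and blue intensities. How the channels are scaled and fed to the network in the actual system may differ.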