Computer vision (CV) combined with a deep convolutional neural network (CNN) has emerged as a reliable analytical method to effectively characterize and quantify high-throughput phenotyping of different grain crops, including rice, wheat, corn, and soybean. In addition to the ability to rapidly obtain information on plant organs and abiotic stresses, and the ability to segment crops from weeds, such techniques have been used to detect pests and plant diseases and to identify grain varieties. The development of corresponding imaging systems to assess the phenotypic parameters, yield, and quality of crop plants will increase the confidence of stakeholders in grain crop cultivation, thereby bringing technical and economic benefits to advanced agriculture. Therefore, this paper provides a comprehensive review of CNNs in computer vision for grain crop phenotyping. It is meaningful to provide a review as a roadmap for future research in such a thriving research area. The CNN models (e.g., VGG, YOLO, and Faster R-CNN) used CV tasks including image classification, object detection, semantic segmentation, and instance segmentation, and the main results of recent studies on crop phenotype detection are discussed and summarized. Additionally, the challenges and future trends of the phenotyping techniques in grain crops are presented.