Leaf counts are vital for estimating crop yield. Traditional manual leaf counting is tedious, costly, and labor-intensive. Recent convolutional neural network-based approaches achieve promising results for rosette plants. However, effective solutions for leaf counting in monocot plants, such as sorghum and maize, are lacking. Existing approaches often require substantial training datasets and annotations, incurring significant labeling overhead. Moreover, these approaches can easily fail when leaf structures are occluded in images. To address these issues, we present a new deep neural network-based method that requires no explicit labeling of leaf structures and achieves superior performance even when leaves are severely occluded. Our method extracts leaf skeletons to gain more topological information and applies augmentation to enhance the structural variety of the original images. We then feed the combination of original images, derived skeletons, and augmentations into a regression model, transferred from Inception-ResNet-V2, for leaf counting. Using an input modification method and a Grad-CAM method, we find that leaf tips are important to our regression model. The superiority of the proposed method is validated by comparison with existing approaches on a similar dataset. The results show that our method not only improves the accuracy of leaf counting in the presence of overlaps and occlusions, but also lowers the training cost, requiring fewer annotations than the previous state-of-the-art approaches. The robustness of the proposed method against noise is also verified by removing environmental noise during image preprocessing and reducing the effect of noise introduced by skeletonization, with satisfactory outcomes.
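The skeleton-extraction step mentioned above can be illustrated with a minimal sketch. The abstract does not say which thinning algorithm is used, so the classic Zhang-Suen thinning procedure is assumed here purely for illustration; the function names `neighbours` and `zhang_suen_thin` are hypothetical, and the input is assumed to be a binary plant mask (1 = plant, 0 = background) that has already been segmented.

```python
def neighbours(img, r, c):
    """Return the 8 neighbours P2..P9, clockwise from north; off-image counts as 0."""
    h, w = len(img), len(img[0])
    offsets = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]
    return [img[r + dr][c + dc] if 0 <= r + dr < h and 0 <= c + dc < w else 0
            for dr, dc in offsets]


def zhang_suen_thin(mask):
    """Thin a binary mask (list of lists of 0/1) with Zhang-Suen thinning.

    Assumption: this stands in for the unspecified skeletonization step.
    """
    img = [list(row) for row in mask]  # work on a copy
    changed = True
    while changed:
        changed = False
        for step in (0, 1):  # two sub-iterations per pass
            to_clear = []
            for r in range(len(img)):
                for c in range(len(img[0])):
                    if img[r][c] != 1:
                        continue
                    n = neighbours(img, r, c)
                    p2, _, p4, _, p6, _, p8, _ = n
                    b = sum(n)  # number of foreground neighbours
                    # number of 0 -> 1 transitions around the pixel
                    a = sum(1 for i in range(8) if n[i] == 0 and n[(i + 1) % 8] == 1)
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_clear.append((r, c))
            # delete in parallel, after scanning the whole image
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img


# Usage: a 3-pixel-thick horizontal bar stands in for a segmented leaf.
mask = [[1 if 1 <= r <= 3 and 1 <= c <= 8 else 0 for c in range(10)] for r in range(5)]
skeleton = zhang_suen_thin(mask)
for row in skeleton:
    print("".join("#" if v else "." for v in row))
```

In a pipeline like the one described, a skeleton image of this kind would be combined with the original image and its augmentations before being passed to the regression network; that combination step is not shown here.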