Drones, or quadcopters, combined with deep learning are widely used in various fields, especially for object detection. However, how the characteristics of drone vision, such as occlusion and small objects, affect detection accuracy and speed is still being explored. The YOLO architecture is commonly used in cases that require high-speed detection. To overcome the limitations of drone vision, in this paper we explore the kernel size of the shallowest convolutional layer in the YOLOv5s backbone to achieve better performance. The kernel is the filter that produces the feature map, and its size defines the dimensions of the convolution matrix; the features produced in the shallowest convolutional layer are more representative for object detection and recognition. The work is divided into three major parts: (1) data preprocessing, which involves augmentation and normalization of the data, (2) kernel size exploration in the shallowest convolutional layer of YOLOv5s, and (3) model implementation in a real environment using the quadcopter. The dataset consists of four classes, dragon fruit, snake fruit, banana, and pineapple, with a total of 8000 samples. The kernel size exploration gives promising results: kernel sizes 5 and 7 both give an mAP of 0.988. These results suggest that kernel size modification merits more in-depth investigation, for example together with the number of epochs, the padding scheme, and other optimization techniques.
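To make the effect of the kernel size in the shallowest layer more concrete, the sketch below applies the standard convolution output-size formula and counts the weights for several kernel sizes. It is an illustration only, not the configuration used in the paper: the 640-pixel input, stride 2, 3 input channels, 32 output channels, and the padding rule `k // 2` are all assumptions chosen for the example, and the function names are hypothetical.

```python
def conv_output_size(n, k, s=1, p=None):
    """Spatial output size of a convolution on an n x n input.

    Uses the standard formula floor((n + 2p - k) / s) + 1.
    If no padding is given, assume p = k // 2 (an assumption for this sketch).
    """
    if p is None:
        p = k // 2
    return (n + 2 * p - k) // s + 1


def conv_weight_count(c_in, c_out, k):
    """Number of weights in a k x k convolution, bias terms excluded."""
    return c_in * c_out * k * k


# Assumed values for illustration only (not taken from the paper).
INPUT_SIZE, STRIDE, C_IN, C_OUT = 640, 2, 3, 32

for k in (3, 5, 7):
    out = conv_output_size(INPUT_SIZE, k, STRIDE)
    weights = conv_weight_count(C_IN, C_OUT, k)
    print(f"kernel={k}: output={out}x{out}, weights={weights}")
```

Under these assumptions the output resolution is the same for every kernel size, so a larger kernel only widens the receptive field of the first layer and increases its weight count.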