IntroductionThe precise detection of vegetation in urban parks is crucial for accurate carbon sink calculations and planting assessments, particularly in high-density cities. Unlike traditional methods designed for forest vegetation, the detection and classification of urban park vegetation face challenges such as unclear boundaries, multiple vegetation categories, low image resolution, labor-intensive manual calculations, and unreliable modeling results. However, by utilizing unmanned aerial vehicles (UAVs) equipped with high-resolution visible and multispectral (MS) remote sensing cameras, it becomes possible to label images with green normalized difference vegetation index (GNDVI) and full-spectral three-channel information.MethodsBy employing a dual attention convolutional neural network (DANet) model that incorporates image fusion, DANet, and feature decoding networks, the high-precision detection of urban park vegetation can be significantly improved.ResultsEmpirical validation carried out in Jinhai Park since 2021 has provided evidence of the effectiveness of the DANet model when utilizing early fusion and feature fusion techniques. This model achieves an accurate detection rate of 88.6% for trees, 92.0% for shrubs, 92.6% for ground cover, and 91.8% for overall vegetation. These detection rates surpass those achieved using only visible images (88.7%) or GNDVI images (86.6%).DiscussionThe enhanced performance can be attributed to the intelligent capabilities of the double-in network. This high-precision detection model provides more precise scientific and technical support for subsequent park carbon sink calculations, assessments of existing vegetation for planting designs, and evaluations of urban ecological impacts.