Olive tree growing is an important economic activity in many countries, mostly in the Mediterranean Basin, Argentina, Chile, Australia, and California. Although recent intensification techniques organize olive groves in hedgerows, most olive groves are rainfed and the trees are scattered (as in Spain and Italy, which account for 50% of the world’s olive oil production). Accurate measurement of trees biovolume is a first step to monitor their performance in olive production and health. In this work, we use one of the most accurate deep learning instance segmentation methods (Mask R-CNN) and unmanned aerial vehicles (UAV) images for olive tree crown and shadow segmentation (OTCS) to further estimate the biovolume of individual trees. We evaluated our approach on images with different spectral bands (red, green, blue, and near infrared) and vegetation indices (normalized difference vegetation index—NDVI—and green normalized difference vegetation index—GNDVI). The performance of red-green-blue (RGB) images were assessed at two spatial resolutions 3 cm/pixel and 13 cm/pixel, while NDVI and GNDV images were only at 13 cm/pixel. All trained Mask R-CNN-based models showed high performance in the tree crown segmentation, particularly when using the fusion of all dataset in GNDVI and NDVI (F1-measure from 95% to 98%). The comparison in a subset of trees of our estimated biovolume with ground truth measurements showed an average accuracy of 82%. Our results support the use of NDVI and GNDVI spectral indices for the accurate estimation of the biovolume of scattered trees, such as olive trees, in UAV images.