Accurate detection and segmentation of individual trees from unmanned aerial vehicle (UAV) images is critical for forest resource surveys and precision forest management. Deep learning methods have been applied to individual tree crown segmentation, species classification, and tree counting in mixed coniferous and broad-leaved forests, but their accuracy still needs improvement. This study therefore uses BlendMask, a simpler and more efficient algorithm that combines ideas from Mask R-CNN and Yolact to blend instance-level information with semantic information at a finer granularity, improving crown segmentation and classification accuracy. UAV images of three coniferous species and five broad-leaved species, collected at the Jing Yue multispecies ecological forestry site in Changping District, Beijing, were used as the dataset, and the results were compared with those of Yolact and Mask R-CNN. On the test set, BlendMask achieved the highest Kappa coefficient (0.89) and overall accuracy (92.14%) of the three methods. For segmentation, the producer's accuracy was 0.91 to 0.95 for coniferous species and 0.89 to 0.92 for broad-leaved species. For species classification, the F1-score and mean average precision were greater than 91% for coniferous species and 77.64% to 85.63% for broad-leaved species. The accuracy of stand density extraction was 0.9909 and 0.9422 in low and medium canopy density stands, respectively, and 0.8913 in high canopy density stands. These results show that BlendMask performs well for multispecies classification, individual tree crown segmentation, and tree counting in complex forest areas. The model is better suited to coniferous forests and to low and medium canopy density stands than to broad-leaved forests and high canopy density stands.
This study provides an important tool for obtaining more accurate species classification, canopy segmentation, and resource inventory results in complex forest areas.
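The accuracy measures reported in this abstract (overall accuracy, Kappa coefficient, producer's accuracy, and F1-score) can all be derived from a class confusion matrix. The following is a minimal sketch of those standard definitions, not the evaluation code used in this study. It assumes that rows hold reference (ground-truth) classes and columns hold predicted classes, and that every class has at least one reference and one predicted sample. The function names and the matrix values are hypothetical and chosen only for illustration.

```python
from typing import List

Matrix = List[List[int]]  # rows = reference classes, columns = predicted classes


def overall_accuracy(cm: Matrix) -> float:
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm: Matrix) -> float:
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(n)) / total
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


def producers_accuracy(cm: Matrix) -> List[float]:
    """Per class: correct samples / reference samples (equivalent to recall)."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]


def users_accuracy(cm: Matrix) -> List[float]:
    """Per class: correct samples / predicted samples (equivalent to precision)."""
    n = len(cm)
    return [cm[j][j] / sum(cm[i][j] for i in range(n)) for j in range(n)]


def f1_scores(cm: Matrix) -> List[float]:
    """Per class: harmonic mean of user's and producer's accuracy."""
    return [
        2 * p * r / (p + r)
        for p, r in zip(users_accuracy(cm), producers_accuracy(cm))
    ]


# Invented 3-class confusion matrix, for illustration only (not study data).
example_cm: Matrix = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 7, 42],
]

if __name__ == "__main__":
    print("overall accuracy:", round(overall_accuracy(example_cm), 4))
    print("kappa:", round(cohen_kappa(example_cm), 4))
    print("producer's accuracy:", producers_accuracy(example_cm))
    print("F1:", f1_scores(example_cm))
```

Kappa is lower than overall accuracy for the same matrix because it subtracts the agreement expected by chance from the row and column totals, which is why both figures are usually reported together.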