Accurate maps of tree species coverage are an important basis for the rational use and protection of existing forest resources. However, most current studies focus on broad tree classification, such as coniferous vs. broadleaf trees, and a refined classification at the species level is urgently needed. Although airborne LiDAR data and unmanned aerial vehicle (UAV) images can capture tree information even at the single-tree level, these approaches are difficult to apply over large areas. Therefore, this study takes the eastern Qilian Mountains as an example to explore the feasibility of tree species classification with satellite images. We used Sentinel-2 images to classify the study area’s major vegetation types, particularly four tree species, i.e., Sabina przewalskii (S.P.), Picea crassifolia (P.C.), Betula spp. (Betula), and Populus spp. (Populus). In addition to spectral features, we also included terrain and texture features in the classification. The results show that adding texture features significantly increases the separability of the tree species. The final classification of all categories achieved an overall accuracy of 86.49% and a Kappa coefficient of 0.83. For the tree classes, the classification accuracy was 90.31%, and the producer’s accuracy (PA) and user’s accuracy (UA) were all higher than 84.97%. We found that altitude, slope, and aspect all affected the spatial distribution of the four tree species in the study area. This study confirms the potential of Sentinel-2 images for the fine classification of tree species. Such classification can also help monitor ecosystem biodiversity and provide a reference for forest inventory estimation.
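For readers less familiar with the accuracy measures reported here, the following is a minimal sketch of how overall accuracy, the Kappa coefficient, PA, and UA are conventionally derived from a confusion matrix. The function name `accuracy_metrics` and the three-class matrix are illustrative assumptions. They are not code or data from this study, and the sketch assumes every class has at least one reference sample and one predicted sample.

```python
import numpy as np

def accuracy_metrics(cm):
    """Standard accuracy measures from a confusion matrix.

    Convention assumed here: cm[i, j] is the number of reference
    samples of class i that the classifier labelled as class j.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    correct = np.diag(cm)            # correctly classified samples per class
    ref_totals = cm.sum(axis=1)      # row sums: reference samples per class
    pred_totals = cm.sum(axis=0)     # column sums: predicted samples per class

    overall = correct.sum() / total
    # Agreement expected by chance, from the row and column marginals
    chance = (ref_totals * pred_totals).sum() / total**2
    kappa = (overall - chance) / (1.0 - chance)

    producers = correct / ref_totals   # PA: 1 - omission error
    users = correct / pred_totals      # UA: 1 - commission error
    return overall, kappa, producers, users

# Hypothetical 3-class confusion matrix (made-up numbers, not study results)
example_cm = [[48, 1, 1],
              [6, 40, 4],
              [2, 8, 40]]

oa, kappa, pa, ua = accuracy_metrics(example_cm)
print(f"Overall accuracy: {oa:.4f}, Kappa: {kappa:.4f}")
print("PA per class:", np.round(pa, 4))
print("UA per class:", np.round(ua, 4))
```

PA is computed along the reference totals and UA along the predicted totals, so a class can have a high PA while its UA is low if many samples from other classes are mislabelled as that class.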