Data on the distribution of tree species are often requested by forest managers, inventory agencies, foresters, and private and municipal forest owners. However, the automated detection of tree species from passive remote sensing data acquired in aerial surveys is not yet developed enough to give reliable results independent of phenological stage, time of day, season, tree vitality and prevailing atmospheric conditions. Here, we introduce a novel tree species classification approach based on high-resolution RGB image data gathered during automated UAV flights that overcomes these shortcomings. For the classification task, a computationally lightweight convolutional neural network (CNN) was designed. We show that with the chosen CNN model architecture, average classification accuracies of 92% can be reached for four different tree species, independently of the illumination conditions and the phenological stage. We also show that a minimal ground sampling distance of 1.6 cm/px is needed for the classification model to be able to make use of the spatial-structural information in the data. Finally, to demonstrate that the presented approach can be used to derive spatially explicit tree species information, a gridded product is generated that yields an average classification accuracy of 88%.
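To illustrate what a gridded product of this kind could look like, the following minimal sketch splits an image into square cells, assigns one label per cell, and scores the resulting grid against a reference grid. It is a sketch under stated assumptions, not the implementation used in this work: the function names (`tile_grid`, `classify_grid`, `grid_accuracy`), the single-channel toy image, and the brightness-threshold `toy_classifier` standing in for the CNN are all hypothetical.

```python
from typing import Callable, List, Sequence

Tile = List[List[int]]


def tile_grid(image: Sequence[Sequence[int]], tile_px: int) -> List[List[Tile]]:
    """Split a 2D image into non-overlapping square tiles.

    Partial tiles at the right and bottom edges are dropped.
    """
    n_rows = len(image) // tile_px
    n_cols = len(image[0]) // tile_px if image else 0
    return [
        [
            [
                list(image[r * tile_px + i][c * tile_px:(c + 1) * tile_px])
                for i in range(tile_px)
            ]
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


def classify_grid(
    image: Sequence[Sequence[int]],
    tile_px: int,
    classify: Callable[[Tile], str],
) -> List[List[str]]:
    """Apply a per-tile classifier and return one label per grid cell."""
    return [[classify(tile) for tile in row] for row in tile_grid(image, tile_px)]


def grid_accuracy(predicted: List[List[str]], reference: List[List[str]]) -> float:
    """Fraction of grid cells whose predicted label matches the reference."""
    pairs = [
        (p, t)
        for pred_row, ref_row in zip(predicted, reference)
        for p, t in zip(pred_row, ref_row)
    ]
    return sum(p == t for p, t in pairs) / len(pairs)


def toy_classifier(tile: Tile) -> str:
    """Hypothetical stand-in for the CNN: thresholds the mean pixel value."""
    values = [v for row in tile for v in row]
    return "species_a" if sum(values) / len(values) > 100 else "species_b"


# Toy 4x4 single-channel image (assumption: real input would be RGB tiles).
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
predicted = classify_grid(image, tile_px=2, classify=toy_classifier)
reference = [["species_b", "species_a"], ["species_a", "species_a"]]
accuracy = grid_accuracy(predicted, reference)
```

In practice the classifier argument would wrap the trained model, and the cell size in pixels would follow from the desired cell footprint on the ground divided by the ground sampling distance of the imagery.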