Deep learning (DL) has shown promising performances in various remote sensing applications as a powerful tool. To explore the great potential of DL in improving the accuracy of individual tree species (ITS) classification, four convolutional neural network models (ResNet-18, ResNet-34, ResNet-50, and DenseNet-40) were employed to classify four tree species using the combined high-resolution satellite imagery and airborne LiDAR data. A total of 1503 samples of four tree species, including maple, pine, locust, and spruce, were used in the experiments. When both WorldView-2 and airborne LiDAR data were used, the overall accuracies (OA) obtained by ResNet-18, ResNet-34, ResNet-50, and DenseNet-40 were 90.9%, 89.1%, 89.1%, and 86.9%, respectively. The OA of ResNet-18 was increased by 4.0% and 1.8% compared with random forest (86.7%) and support vector machine (89.1%), respectively. The experimental results demonstrated that the size of input images impacted on the classification accuracy of ResNet-18. It is suggested that the input size of ResNet models can be determined according to the maximum size of all tree crown sample images. The use of LiDAR intensity image was helpful in improving the accuracies of ITS classification and atmospheric correction is unnecessary when both pansharpened WorldView-2 images and airborne LiDAR data were used.