This study evaluates the wood classification performance of four convolutional neural network (CNN) architectures (VGG16, ResNet50, GoogLeNet, and a basic CNN) and investigates the factors that affect it. The dataset covered 10 softwood species, with 200 cross-sectional micrographs per species for each of three regions: the total part, earlywood, and latewood. For each dataset, 80% of the images were used for training and 20% for testing. To improve the performance of the architectures, the dataset was augmented, and classification performance before and after augmentation was compared. All four architectures classified species with an accuracy of over 90%, and accuracy increased with the number of epochs. However, the starting points of the accuracy and loss curves and the rate of improvement during training differed by architecture. The latewood dataset gave the highest accuracy. More epochs and augmented datasets also had a positive effect on accuracy, whereas the total part and non-augmented datasets had a negative effect. Additionally, the augmented dataset tended to yield more stable results and reached a convergence point earlier. In the present study, the use of an augmented latewood dataset was the most important factor affecting classification performance, and such a dataset should be used for training CNNs.
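The dataset layout and the 80%/20% split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the species names, file paths, and random seed are hypothetical placeholders, and splitting within each species (a stratified split) is assumed here, since the text does not say how the images were drawn.

```python
import random
from collections import defaultdict

# Hypothetical placeholders: the actual species names and file layout are not given.
SPECIES = [f"species_{i:02d}" for i in range(10)]
REGIONS = ["total", "earlywood", "latewood"]
IMAGES_PER_CLASS = 200   # micrographs per species and region, as stated in the text
TRAIN_FRACTION = 0.8     # 80% training, 20% testing, as stated in the text


def build_index():
    """Return {region: [(image_path, species), ...]} using made-up paths."""
    return {
        region: [
            (f"{region}/{species}/{n:03d}.png", species)
            for species in SPECIES
            for n in range(IMAGES_PER_CLASS)
        ]
        for region in REGIONS
    }


def stratified_split(samples, train_fraction=TRAIN_FRACTION, seed=0):
    """Split samples into train and test lists, species by species.

    Assumption: the split is stratified so that every species contributes
    the same share of images to each subset.
    """
    rng = random.Random(seed)
    by_species = defaultdict(list)
    for image_path, species in samples:
        by_species[species].append(image_path)

    train, test = [], []
    for species, paths in by_species.items():
        shuffled = paths[:]          # copy so the input list is not modified
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train += [(p, species) for p in shuffled[:cut]]
        test += [(p, species) for p in shuffled[cut:]]
    return train, test


# One independent train/test split per region dataset.
splits = {region: stratified_split(samples)
          for region, samples in build_index().items()}
```

Under these assumptions, each region dataset holds 2,000 images (10 species × 200), which the split divides into 1,600 training and 400 test images, with 160 and 40 per species respectively.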