The monitoring of bird populations can provide important information on the state of sensitive ecosystems; however, the manual collection of reliable population data is labour-intensive, time-consuming, and potentially error prone. Automated monitoring using computer vision is therefore an attractive proposition, which could facilitate the collection of detailed data on a much larger scale than is currently possible.A number of existing algorithms are able to classify bird species from individual high quality detailed images often using manual inputs (such as a priori parts labelling). However, deployment in the field necessitates fully automated in-flight classification, which remains an open challenge due to poor image quality, high and rapid variation in pose, and similar appearance of some species. We address this as a fine-grained classification problem, and have collected a video dataset of thirteen bird classes (ten species and another with three colour variants) for training and evaluation. We present our proposed algorithm, which selects effective features from a large pool of appearance and motion features. We compare our method to others which use appearance features only, including image classification using state-ofthe-art Deep Convolutional Neural Networks (CNNs). Using our algorithm we achieved an 90% correct classification rate, and we also show that using effectively selected motion and appearance features together can produce results which outperform state-of-the-art single image classifiers. We also show that the most significant motion features improve correct classification rates by 7% compared to using appearance features alone.