Numerous data-driven fault diagnosis methods have been developed in recent years, and tasks in which the training and test data share the same distribution are now handled well. However, owing to the particularity of gas-insulated switchgear (GIS), collecting large amounts of data, especially data with the same distribution, is difficult. Existing fault diagnosis methods therefore struggle to achieve satisfactory insulation defect diagnosis with small datasets. To address this problem, a novel domain adversarial transfer convolutional neural network (DATCNN) is proposed for diagnosing GIS insulation defects from small samples. First, a residual CNN is built to learn feature representations from the source and target domains. Second, a domain adversarial training strategy is used for feature transfer: a conditional adversarial mechanism is introduced, and the joint distribution of features and labels is represented by a random linear combination, which enables the simultaneous adaptation of features and labels. Finally, the Nesterov accelerated gradient descent optimisation algorithm is used to speed up convergence. DATCNN achieves diagnosis accuracies of 99.15% in the laboratory and at least 89.5% on-site for GIS insulation defects. Comprehensive experimental results demonstrate the effectiveness and superiority of the proposed method in diagnosing GIS insulation defects with small samples.
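As an illustration of the final step, the look-ahead update behind Nesterov accelerated gradient descent can be sketched in plain Python. This is a minimal sketch on a toy quadratic loss, not the DATCNN training code: the names `nesterov_minimise` and `quadratic_grad`, the learning rate, the momentum coefficient, and the step count are all assumptions chosen for illustration.

```python
def nesterov_minimise(grad, x0, lr=0.1, momentum=0.9, steps=200):
    """Minimise a function with Nesterov accelerated gradient, given its gradient.

    All hyperparameter defaults are illustrative assumptions.
    """
    x = list(x0)
    v = [0.0] * len(x)  # velocity (accumulated momentum), starts at rest
    for _ in range(steps):
        # Look ahead along the current velocity before evaluating the gradient.
        lookahead = [xi + momentum * vi for xi, vi in zip(x, v)]
        g = grad(lookahead)
        # Update the velocity with the look-ahead gradient, then move the parameters.
        v = [momentum * vi - lr * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return x


def quadratic_grad(p):
    # Gradient of the hypothetical toy loss f(x, y) = 0.5 * (x**2 + 10 * y**2).
    return [p[0], 10.0 * p[1]]


# Starting away from the minimum at (0, 0), the iterates approach the origin.
result = nesterov_minimise(quadratic_grad, [5.0, -3.0])
```

The only difference from classical momentum is that the gradient is evaluated at the look-ahead point rather than at the current parameters, which tends to damp oscillation along steep directions of the loss surface.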