With the development of deep learning and sensors and sensor collection methods, computer vision inspection technology has developed rapidly. The deep-learning-based classification algorithm requires the acquisition of a model with superior generalization capabilities through the utilization of a substantial quantity of training samples. However, due to issues such as privacy, annotation costs, and sensor-captured images, how to make full use of limited samples has become a major challenge for practical training and deployment. Furthermore, when simulating models and transferring them to actual image scenarios, discrepancies often arise between the common training sets and the target domain (domain offset). Currently, meta-learning offers a promising solution for few-shot learning problems. However, the quantity of supporting set data on the target domain remains limited, leading to limited cross-domain learning effectiveness. To address this challenge, we have developed a self-distillation and mixing (SDM) method utilizing a Teacher–Student framework. This method effectively transfers knowledge from the source domain to the target domain by applying self-distillation techniques and mixed data augmentation, learning better image representations from relatively abundant datasets, and achieving fine-tuning in the target domain. In comparison with nine classical models, the experimental results demonstrate that the SDM method excels in terms of training time and accuracy. Furthermore, SDM effectively transfers knowledge from the source domain to the target domain, even with a limited number of target domain samples.