We introduce a new Procrustes-type method called matching component analysis to isolate components in data for transfer learning. Our theoretical results describe the sample complexity of this method, and we demonstrate through numerical experiments that our approach is indeed well suited for transfer learning.
IntroductionMany state-of-the-art classification algorithms require a large training set that is statistically similar to the test set. For example, deep learning-based approaches require a large number of representative samples in order to find near-optimal network weights and biases [4,9]. Similarly, template-based approaches require large dictionaries of training images so that each test image can be represented by an element of the dictionary [21,26,17,6]. For each technique, if test images cannot be represented in a feature space that has been determined from the training set, then classification accuracy is poor.In applications such as synthetic aperture radar (SAR) automatic target recognition (ATR), it is infeasible to collect the volume of data necessary to naively train high-accuracy classification networks. Additionally, due to varying operating conditions, the features measured in SAR imagery are different from those extracted from electro-optical (EO) imagery [13]. As such, off-the-shelf networks that have been pre-trained on the popular EObased ImageNet [3] or CIFAR-10 [8] datasets are insufficient for performing accurate ATR tasks in different imaging domains. In fact, recent work has demonstrated that pre-trained networks fail to effectively generalize to random perturbations on test sets [19,18]. To build more representative training sets, additional data are often generated using modeling and simulation software. However, due to various model errors, simulated data often misrepresent the real-world scattering observed in measured imagery. Thus, even though it is possible to augment training sets with a large amount of simulated data, the inherent differences in sensor modalities and data representations make modifying classification networks a non-trivial task [20].In this paper, we introduce matching component analysis (MCA) to help remedy this situation. Given a small number of images from the training domain and matching *