Following the success of deep learning (DL) models in various image processing tasks, DL models, particularly convolutional neural networks (CNNs), have been introduced into the geosciences to help geologists interpret seismic data faster. However, generalization remains a challenge for DL-based fault interpretation: when these models are applied to seismic data with different characteristics, their performance degrades significantly. Several recent studies have proposed transfer learning techniques, in which related but distinct source tasks are assumed to benefit the target task. However, it is unclear which source datasets are most beneficial for fault interpretation. In this paper, we first show through a systematic literature review that synthetic seismic datasets are the most popular source datasets in this area and that previous studies have not compared them with other types of datasets. We then demonstrate experimentally that the choice of source dataset should depend on the amount of annotation available in the target dataset. In addition, normalization appears to be an essential factor in fine-tuning, particularly for fault interpretation. Finally, state-of-the-art performance was achieved on the ThebeFault dataset (0.