Structural health monitoring and condition assessment of existing bridge decks are growing challenges. Conventional manned inspections are costly, labor-intensive, and often risky to execute. Subsurface delamination, a leading cause of deck replacement, can be detected autonomously and objectively by applying deep learning models to infrared thermography (IRT) data, which addresses some of the limitations of manned inspection. Although deep convolutional neural networks (DCNNs) are among the most promising classifiers, they have not been used to their full potential for delamination detection, arguably because realistic ground truth datasets are scarce. In this study, a common encoder–decoder DCNN for semantic segmentation is adapted to this task through domain adaptation. The model was tuned and trained on a publicly available dataset to detect subsurface delamination in IRT data collected from in-service bridge decks. The authors investigated the effects of dataset augmentation, class imbalance, the number of classes, and background removal in the training dataset, resulting in a total of seventy-five U-Net models. Data from four of the five bridges were used for training and validation, and the fifth bridge was reserved for testing. Most models averaged 80 iterations, and training reached an accuracy of 75% with a loss of about 0.6 without any overfitting. The results showed a substantial difference between the minimum and maximum values of the evaluated performance metrics (0.447 and 0.773 for global accuracy, 0.494 and 0.657 for mean accuracy, 0.239 and 0.716 for precision, 0.243 and 0.558 for true positive rate (TPR), 0.529 and 0.899 for true negative rate (TNR), and 0.282 and 0.550 for F1-score). The results also indicated that the models trained on the raw annotated balanced dataset performed best for half of the metrics. In contrast, the models trained on raw data (with no dataset enhancement) performed better when only global accuracy was considered.
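The performance metrics listed above can all be derived from a pixel-wise confusion matrix. The following is a minimal Python sketch, not the authors' evaluation code. It assumes a two-class problem (delaminated versus sound), binary masks of equal shape, and that mean accuracy denotes the average of the per-class accuracies. The function name `segmentation_metrics` and the example masks are hypothetical.

```python
import numpy as np


def segmentation_metrics(y_true, y_pred):
    """Compute pixel-wise metrics for a binary segmentation mask.

    Assumption: 1/True marks delaminated pixels, 0/False marks sound pixels.
    """
    y_true = np.asarray(y_true, dtype=bool).ravel()
    y_pred = np.asarray(y_pred, dtype=bool).ravel()

    # Entries of the 2x2 confusion matrix.
    tp = int(np.sum(y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))

    def ratio(num, den):
        # Return NaN instead of raising when a class is absent.
        return num / den if den else float("nan")

    tpr = ratio(tp, tp + fn)        # true positive rate (recall)
    tnr = ratio(tn, tn + fp)        # true negative rate (specificity)
    precision = ratio(tp, tp + fp)

    return {
        "global_accuracy": ratio(tp + tn, tp + tn + fp + fn),
        # Assumed definition: mean of the per-class accuracies.
        "mean_accuracy": (tpr + tnr) / 2,
        "precision": precision,
        "tpr": tpr,
        "tnr": tnr,
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical 8-pixel example, for illustration only.
example_true = [1, 1, 0, 0, 0, 0, 1, 0]
example_pred = [1, 0, 0, 0, 1, 0, 1, 0]
example_metrics = segmentation_metrics(example_true, example_pred)
```

Because global accuracy pools all pixels, it is dominated by the majority class when the classes are imbalanced, whereas TPR, precision, and F1-score depend on how well the minority (delaminated) class is recovered. This is one reason a model can rank differently on global accuracy than on the other metrics.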