Background
Vessel‐wall volume and localized three‐dimensional ultrasound (3DUS) metrics are sensitive to changes in carotid atherosclerosis in response to medical and dietary interventions. The manual segmentation of the media‐adventitia boundary (MAB) and lumen‐intima boundary (LIB) required to obtain these metrics is time‐consuming and prone to observer variability. Although supervised deep‐learning segmentation models have been proposed, training these models requires a sizeable manually segmented training set, making larger clinical studies prohibitive.

Purpose
We aim to develop a method to optimize pre‐trained segmentation models without requiring manual segmentation to supervise the fine‐tuning process.

Methods
We developed an adversarial framework, the unsupervised shape‐and‐texture generative adversarial network (USTGAN), to fine‐tune a convolutional neural network (CNN) pre‐trained on a source dataset for accurate segmentation of a target dataset. The network integrates a novel texture‐based discriminator with a shape‐based discriminator, which together provide feedback for the CNN to segment the target images in the same way it segments the source images. The texture‐based discriminator increases the accuracy of the CNN in locating the artery, thereby lowering the number of failed segmentations. Failed segmentations were further reduced by a self‐checking mechanism that flags longitudinal discontinuity of the artery, and by self‐correction strategies involving surface interpolation followed by case‐specific tuning of the CNN. The U‐Net was pre‐trained on the source dataset of 224 3DUS volumes, with 136, 44, and 44 volumes in the training, validation, and testing sets, respectively. Training USTGAN involved the same 136 training volumes from the source dataset and 533 volumes from the target dataset. No segmented boundaries for the target cohort were available for training USTGAN.
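The self‐checking mechanism flags segmentations whose artery contour jumps abruptly between adjacent slices along the vessel. A minimal sketch of one such check, assuming contours are stored as lists of (x, y) points per slice and using a centroid‐jump criterion with a hypothetical `max_jump` threshold (both the representation and the criterion are illustrative assumptions, not the authors' implementation):

```python
import math

def centroid(points):
    """Mean (x, y) position of a contour's points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def flag_discontinuity(slice_contours, max_jump=5.0):
    """Return indices of slices whose contour centroid moves more than
    max_jump (pixels) from the previous slice's centroid, suggesting the
    segmentation lost track of the artery between slices."""
    flagged = []
    prev = None
    for i, contour in enumerate(slice_contours):
        c = centroid(contour)
        if prev is not None and math.dist(c, prev) > max_jump:
            flagged.append(i)
        prev = c
    return flagged
```

Flagged cases would then be routed to the self‐correction steps (surface interpolation and case‐specific tuning) rather than accepted as final output.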
The validation and testing of USTGAN involved 118 and 104 volumes from the target cohort, respectively. Segmentation accuracy was quantified by the Dice similarity coefficient (DSC) and the incorrect localization rate (ILR). Tukey's Honestly Significant Difference multiple‐comparison test was employed to quantify the differences in DSC between models and settings, where was considered statistically significant.

Results
USTGAN attained a DSC of % in LIB and % in MAB, improving on the baseline performance of % in LIB (p ) and % in MAB (p ). Our approach outperformed six state‐of‐the‐art domain‐adaptation models (MAB: , LIB: ). The proposed USTGAN also had the lowest ILR among the methods compared (LIB: 2.5%, MAB: 1.7%).

Conclusion
Our framework improves segmentation generalizability, thereby facilitating efficient carotid disease monitoring in multicenter trials and in clinics with less expertise in 3DUS imaging.
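The DSC used for evaluation is the standard overlap metric 2|A∩B|/(|A|+|B|) between an automated and a manual segmentation. A minimal sketch, assuming binary masks are represented as sets of voxel coordinates (the set representation is an illustrative choice, not from the paper):

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks given as
    sets of voxel coordinates: 2|A ∩ B| / (|A| + |B|)."""
    intersection = len(mask_a & mask_b)
    total = len(mask_a) + len(mask_b)
    # Two empty masks are treated as perfectly agreeing.
    return 2.0 * intersection / total if total else 1.0
```

DSC ranges from 0 (no overlap) to 1 (perfect agreement); a value near 0 for a given case would indicate the kind of incorrect localization counted by the ILR.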