Current artificial intelligence (AI) models still perform inadequately in multi-disease diagnosis on real-world data, which typically follow a long-tail distribution. To tackle this issue, a public long-tail dataset, “ChestX-ray14,” which contains 14 disease labels, was randomly divided into training, validation, and test sets at ratios of 0.7, 0.1, and 0.2. Two pretrained state-of-the-art networks, EfficientNet-b5 and CoAtNet-0-rw, were chosen as backbones. After the fully connected layer, a final layer of 14 sigmoid activation units was added to output the diagnosis for each disease. To achieve better adaptive learning, a novel loss ($L_{\text{ours}}$) was designed that combines reweighting with a focus on tail samples. For comparison, a pretrained ResNet50 network with weighted binary cross-entropy loss (
$L_{\text{WBCE}}$) was used as the baseline, having shown the best performance in a previous study. The area under the receiver operating characteristic curve (AUROC) was evaluated overall and for each disease label individually, and compared among the models. Group-score-weighted class activation mapping (Group-CAM) was applied for visual interpretation. As a result, the pretrained CoAtNet-0-rw +
$L_{\text{ours}}$ achieved the best overall AUROC of 0.842, significantly higher than that of ResNet50 + $L_{\text{WBCE}}$ (AUROC: 0.811, $p$ = 0.037). Group-CAM showed that the model attended to the lesions for most disease labels (e.g., atelectasis, edema, effusion) but attended to the wrong regions for others, such as pneumothorax; in addition, mislabeled samples were found in the dataset. Overall, this study presents an advanced AI diagnostic model that significantly improves multi-disease diagnosis of chest X-rays, particularly on real-world data with challenging long-tail distributions.
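
The baseline loss and the evaluation metric named above can be illustrated with a minimal, dependency-free Python sketch. It is not the study's implementation: the exact form of the novel loss is not given here and is not reproduced, and the weighting scheme (negatives divided by positives per label) is an assumption chosen only for illustration. The function names `positive_weights`, `weighted_bce`, and `auroc` are hypothetical, and the toy data use 2 labels rather than 14.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function, one per output unit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def positive_weights(labels):
    """Per-label weight = #negatives / #positives (assumed reweighting scheme)."""
    n_samples = len(labels)
    n_labels = len(labels[0])
    weights = []
    for j in range(n_labels):
        pos = sum(row[j] for row in labels)
        # Guard against labels with no positive sample in this split.
        weights.append((n_samples - pos) / max(pos, 1))
    return weights


def weighted_bce(logits, labels, pos_weight, eps=1e-7):
    """Mean weighted binary cross-entropy over all samples and labels."""
    total, count = 0.0, 0
    for z_row, y_row in zip(logits, labels):
        for z, y, w in zip(z_row, y_row, pos_weight):
            # Clip the probability so that log() never receives 0.
            p = min(max(sigmoid(z), eps), 1.0 - eps)
            total += -(w * y * math.log(p) + (1 - y) * math.log(1.0 - p))
            count += 1
    return total / count


def auroc(scores, targets):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of samples,
    so suitable only for small illustrative inputs.
    """
    pos = [s for s, t in zip(scores, targets) if t == 1]
    neg = [s for s, t in zip(scores, targets) if t == 0]
    if not pos or not neg:
        return float("nan")  # undefined when one class is absent
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy multi-label data: 4 samples, 2 labels; the second label is rarer.
toy_labels = [[1, 0], [1, 0], [0, 1], [0, 0]]
toy_logits = [[0.0, 0.0] for _ in toy_labels]
toy_weights = positive_weights(toy_labels)
toy_loss = weighted_bce(toy_logits, toy_labels, toy_weights)
```

In this sketch the rarer label receives the larger weight, so errors on its positive samples contribute more to the loss. An overall AUROC could then be formed by averaging `auroc` over the labels, although how the study aggregates per-label values is not stated here.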