Background
Pathological complete response (pCR) to neoadjuvant chemotherapy (NAC) correlates strongly with improved survival in breast cancer (BC) patients. However, pCR rates to NAC are less than 30%, depending on the BC subtype. Early prediction of NAC response would facilitate therapeutic modifications for individual patients, potentially improving overall treatment outcomes and patient survival.

Purpose
This study proposes, for the first time, a hierarchical self‐attention‐guided deep learning framework to predict NAC response in BC patients using digital histopathological images of pre‐treatment biopsy specimens.

Methods
Digitized hematoxylin and eosin‐stained slides of BC core needle biopsies were obtained from 207 patients treated with NAC followed by surgery. The response to NAC for each patient was determined after surgery using the standard clinical and pathological criteria. The digital pathology images were processed through the proposed hierarchical framework, which consists of patch‐level and tumor‐level processing modules followed by a patient‐level response prediction component. A combination of convolutional layers and transformer self‐attention blocks was used in the patch‐level processing architecture to generate optimized feature maps. The feature maps were analyzed through two vision transformer architectures adapted for the tumor‐level processing and the patient‐level response prediction components. The feature map sequences for these transformer architectures were defined based on the patch positions within the tumor beds and the bed positions within the biopsy slide, respectively. Five‐fold cross‐validation at the patient level was applied to the training set (144 patients with 9430 annotated tumor beds and 1,559,784 patches) to train the models and optimize the hyperparameters.
An unseen independent test set (63 patients with 3574 annotated tumor beds and 173,637 patches) was used to evaluate the framework.

Results
On the test set, the proposed hierarchical framework achieved an AUC of 0.89 and an F1‐score of 90% for predicting pCR to NAC a priori. Similar frameworks with the patch‐level, patch‐level + tumor‐level, and patch‐level + patient‐level processing components yielded AUCs of 0.79, 0.81, and 0.84 and F1‐scores of 86%, 87%, and 89%, respectively.

Conclusions
The results demonstrate the high potential of the proposed hierarchical deep‐learning methodology for analyzing digital pathology images of pre‐treatment tumor biopsies to predict the pathological response of BC to NAC.
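
Illustrative sketch
The patient‐level cross‐validation mentioned in the Methods means that all patches from one patient fall in the same fold, so no patient contributes to both training and validation data. The following is a minimal sketch of that idea only, not the authors' implementation: the function name `patient_level_folds`, the toy patient IDs, and the shuffled round‐robin assignment of patients to folds are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def patient_level_folds(patch_patient_ids, n_folds=5, seed=0):
    """Return (train_idx, val_idx) pairs in which each patient's patches
    stay together in a single fold (hypothetical helper, for illustration)."""
    # Shuffle the unique patients, then assign them to folds round-robin.
    patients = sorted(set(patch_patient_ids))
    random.Random(seed).shuffle(patients)
    fold_of_patient = {p: i % n_folds for i, p in enumerate(patients)}

    # Group patch indices by the fold of the patient they belong to.
    folds = defaultdict(list)
    for idx, pid in enumerate(patch_patient_ids):
        folds[fold_of_patient[pid]].append(idx)

    # Each fold serves once as the validation set; the rest form the training set.
    splits = []
    for k in range(n_folds):
        val_idx = folds[k]
        train_idx = [i for j in range(n_folds) if j != k for i in folds[j]]
        splits.append((train_idx, val_idx))
    return splits


# Toy data (assumed): 10 patients with 3 patches each.
patch_patient_ids = [f"P{p:02d}" for p in range(10) for _ in range(3)]
splits = patient_level_folds(patch_patient_ids)

for fold, (train_idx, val_idx) in enumerate(splits):
    val_patients = sorted({patch_patient_ids[i] for i in val_idx})
    print(f"fold {fold}: {len(train_idx)} training patches, validation patients {val_patients}")
```

Splitting by patient rather than by patch avoids the optimistic bias that would arise if patches from the same biopsy appeared on both sides of a split. In practice a stratified grouping that also balances responders and non‐responders across folds may be preferable; the abstract does not say whether this was done.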