Deep learning approaches have achieved significant progress in breast cancer histopathological image diagnosis, but training an interpretable diagnosis model on high-resolution histopathological images remains challenging. To address this problem, a novel multi-view attention-guided multiple instance detection network (MA-MIDN) is proposed. The traditional image classification task is reformulated as a weakly supervised multiple instance learning (MIL) problem. Each histopathology image is first divided into instances that form a corresponding bag, so that its high-resolution information is fully exploited through MIL. A new multi-view attention (MVA) algorithm is then proposed to learn attention weights over the instances of an image and thereby localize its lesion regions. An MVA-guided MIL pooling strategy is designed to aggregate instance-level features into bag-level features for the final classification. Consequently, the proposed MA-MIDN model performs lesion localization and image classification simultaneously. In particular, MA-MIDN is trained under the deep mutual learning (DML) schema, which extends DML to the weakly supervised learning setting. Three public breast cancer histopathological image datasets are used to evaluate classification and localization performance. The experimental results demonstrate that MA-MIDN outperforms the latest baselines in terms of diagnosis accuracy, AUC, precision, recall, and F1 score. Notably, it achieves better localization results without compromising classification performance, demonstrating its higher practicality. The code for the MA-MIDN model is available at https://github.com/lcxlcx/MA-MIDN.

INDEX TERMS breast cancer diagnosis; multiple instance learning; multi-view attention; diagnosis interpretability; deep mutual learning.
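
To give a concrete sense of the attention-guided MIL pooling summarized above, the snippet below is a minimal sketch of a generic gated-attention MIL pooling layer in PyTorch. It is not the paper's MVA algorithm: the class name `AttentionMILPooling`, the feature and attention dimensions, and the usage values are illustrative assumptions, shown only to clarify how instance-level features are aggregated into a bag-level feature via learned attention weights.

```python
# Minimal sketch of attention-based MIL pooling (illustrative; not the exact MVA algorithm).
import torch
import torch.nn as nn


class AttentionMILPooling(nn.Module):
    """Aggregate instance-level features into one bag-level feature
    using learned attention weights (generic gated-attention formulation)."""

    def __init__(self, feat_dim: int = 512, attn_dim: int = 128, num_classes: int = 2):
        super().__init__()
        self.attn_V = nn.Linear(feat_dim, attn_dim)   # tanh branch
        self.attn_U = nn.Linear(feat_dim, attn_dim)   # sigmoid (gating) branch
        self.attn_w = nn.Linear(attn_dim, 1)          # one attention score per instance
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, instance_feats: torch.Tensor):
        # instance_feats: (num_instances, feat_dim) for one bag (one histopathology image)
        scores = self.attn_w(torch.tanh(self.attn_V(instance_feats)) *
                             torch.sigmoid(self.attn_U(instance_feats)))   # (N, 1)
        attn = torch.softmax(scores, dim=0)             # attention over instances
        bag_feat = (attn * instance_feats).sum(dim=0)   # weighted sum -> (feat_dim,)
        logits = self.classifier(bag_feat)              # bag-level prediction
        return logits, attn.squeeze(-1)                 # attention weights highlight lesion patches


# Usage: 64 image patches (instances), each with a hypothetical 512-dim CNN feature.
if __name__ == "__main__":
    pooling = AttentionMILPooling()
    patches = torch.randn(64, 512)
    logits, attn_weights = pooling(patches)
    print(logits.shape, attn_weights.shape)  # torch.Size([2]) torch.Size([64])
```

The key design point this sketch illustrates is that the same attention weights serve two purposes at once: they weight instances when forming the bag-level feature for classification, and they provide an instance-level saliency map that can be mapped back onto the image for lesion localization.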