Breast cancer is a serious threat to women. Many machine learning-based computer-aided diagnosis (CAD) methods have been proposed for the early diagnosis of breast cancer based on histopathological images. Even though many such classification methods achieved high accuracy, many of them lack the explanation of the classification process. In this paper, we compare the performance of conventional machine learning (CML) against deep learning (DL)-based methods. We also provide a visual interpretation for the task of classifying breast cancer in histopathological images. For CML-based methods, we extract a set of handcrafted features using three feature extractors and fuse them to get image representation that would act as an input to train five classical classifiers. For DL-based methods, we adopt the transfer learning approach to the well-known VGG-19 deep learning architecture, where its pre-trained version on the large scale ImageNet, is block-wise fine-tuned on histopathological images. The evaluation of the proposed methods is carried out on the publicly available BreaKHis dataset for the magnification dependent classification of benign and malignant breast cancer and their eight sub-classes, and a further validation on KIMIA Path960, a magnification-free histopathological dataset with 20 image classes, is also performed. After providing the classification results of CML and DL methods, and to better explain the difference in the classification performance, we visualize the learned features. For the DL-based method, we intuitively visualize the areas of interest of the best fine-tuned deep neural networks using attention maps to explain the decision-making process and improve the clinical interpretability of the proposed models. The visual explanation can inherently improve the pathologist’s trust in automated DL methods as a credible and trustworthy support tool for breast cancer diagnosis. The achieved results show that DL methods outperform CML approaches where we reached an accuracy between 94.05% and 98.13% for the binary classification and between 76.77% and 88.95% for the eight-class classification, while for DL approaches, the accuracies range from 85.65% to 89.32% for the binary classification and from 63.55% to 69.69% for the eight-class classification.