Purpose: To develop a computer-aided detection (CAD) system for masses in digital breast tomosynthesis (DBT) volumes using a deep convolutional neural network (DCNN) with transfer learning from mammograms. Methods: A data set containing 2282 digitized film and digital mammograms and 324 DBT volumes was collected with IRB approval. The mass of interest on each image was marked by an experienced breast radiologist as the reference standard. The data set was partitioned into a training set (2282 mammograms with 2461 masses and 230 DBT views with 228 masses) and an independent test set (94 DBT views with 89 masses). For DCNN training, the region of interest (ROI) containing the mass (true positive) was extracted from each image. False positive (FP) ROIs were identified at prescreening by the authors' previously developed CAD systems. After data augmentation, a total of 45 072 mammographic ROIs and 37 450 DBT ROIs were obtained. Data normalization and reduction of non-uniformity in the ROIs across the heterogeneous data were achieved using a background correction method applied to each ROI. A DCNN with four convolutional layers and three fully connected (FC) layers was first trained on the mammography data. Jittering and dropout techniques were used to reduce overfitting. After training with the mammographic ROIs, all weights in the first three convolutional layers were frozen, and only the last convolutional layer and the FC layers were randomly re-initialized and trained using the DBT training ROIs. The authors compared the performance of two CAD systems for mass detection in DBT: one used the DCNN-based approach and the other used their previously developed feature-based approach for FP reduction. The prescreening stage was identical in both systems, passing the same set of mass candidates to the FP reduction stage.
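The freezing and re-initialization step described above can be sketched as follows. This is a minimal, framework-free illustration and not the authors' implementation: the `Layer` class, the layer names, the weight-vector length, and the Gaussian initializer are all assumptions made for the example, and no actual training is performed.

```python
import random
from dataclasses import dataclass


@dataclass
class Layer:
    # Hypothetical stand-in for a network layer: a name, a flat list of
    # weights, and a flag saying whether training may update the weights.
    name: str
    weights: list
    trainable: bool = True


def make_network(rng):
    """Build four convolutional and three FC layers (sizes are placeholders)."""
    names = ["conv1", "conv2", "conv3", "conv4", "fc1", "fc2", "fc3"]
    return [Layer(n, [rng.gauss(0.0, 0.01) for _ in range(8)]) for n in names]


def prepare_for_transfer(layers, frozen_names, rng):
    """Freeze the named layers; randomly re-initialize all remaining layers."""
    prepared = []
    for layer in layers:
        if layer.name in frozen_names:
            # Keep the weights learned in the first stage and mark them fixed.
            prepared.append(Layer(layer.name, list(layer.weights), trainable=False))
        else:
            # Discard the first-stage weights and start again from random values.
            fresh = [rng.gauss(0.0, 0.01) for _ in layer.weights]
            prepared.append(Layer(layer.name, fresh, trainable=True))
    return prepared


rng = random.Random(0)
mammo_net = make_network(rng)  # stands in for the DCNN trained on mammographic ROIs
dbt_net = prepare_for_transfer(mammo_net, {"conv1", "conv2", "conv3"}, rng)

trainable = [layer.name for layer in dbt_net if layer.trainable]
print(trainable)  # the layers that would be trained on the DBT ROIs
```

In a real deep learning framework, the same idea would typically be expressed by disabling gradient updates for the first three convolutional layers and resetting the parameters of the remaining layers before training on the DBT ROIs.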
For the feature-based CAD system, 3D clustering and an active contour method were used for segmentation; morphological, gray level, and texture features were extracted and merged with a linear discriminant classifier to score the detected masses. For the DCNN-based CAD system, ROIs from five consecutive slices centered at each candidate were passed through the trained DCNN and a mass likelihood score was generated. The performance of the CAD systems was evaluated using free-response ROC curves, and the performance difference was analyzed using a non-parametric method. Results: Before transfer learning, the DCNN trained only on mammograms, where it attained an AUC of 0.99, classified DBT masses with an AUC of 0.81 in the DBT training set. After transfer learning with DBT, the AUC improved to 0.90. For breast-based CAD detection in the test set, the sensitivities of the feature-based and the DCNN-based CAD systems were 83% and 91%, respectively, at 1 FP/DBT volume. The difference in performance between the two systems was statistically significant (p-value < 0.05). Conclusions: The image patterns learned from the mammograms were transferred to mass detection on DBT slices through the DCNN. This study demonstrated that large data s...