Purpose
In recent years, deep learning based on convolutional neural networks (CNNs) has succeeded in mining the relationship between the microstructures and macroscopic properties of materials. However, such CNN models usually rely heavily on a large set of labeled images to ensure the accuracy and generalization ability of the predictive models, and in many fields acquiring image data is expensive and inconvenient. This study proposes a data augmentation technique to enhance the performance of CNN models that link microstructural images to the macroscopic properties of composites.

Design/methodology/approach
Microstructures of composites are synthesized using discrete element simulations and Potts kinetic Monte Carlo simulations. Macroscopic properties such as the elastic modulus, Poisson's ratio, shear modulus, coefficient of thermal expansion, and triple-phase boundary length density are extracted from representative volume elements. The CNN model is trained using the 3D microstructural images as inputs and the corresponding macroscopic properties as labels. The predictive performance of the CNN models with and without data augmentation is then compared.

Findings
Data augmentation reduced the weighted mean absolute percentage error (WMAPE) of the CNN predictions from 5.1627% to 1.7014%. This reduction indicates that the proposed data augmentation method can effectively enhance the generalization ability and robustness of CNN models.

Originality/value
This study demonstrates that data augmentation is a low-cost remedy for model overfitting, data scarcity, and sample imbalance in CNN-based deep learning tasks. With the development of more advanced data augmentation techniques, deep-learning-accelerated homogenization will boost the field of multi-scale computational mechanics and materials.
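
The findings are reported as a weighted mean absolute percentage error (WMAPE), which the abstract does not define. The sketch below assumes one common definition: the sum of absolute errors divided by the sum of absolute true values, expressed in percent. The study may weight the individual properties differently. The function name `wmape` and the sample values are hypothetical and are not taken from the study.

```python
from typing import Sequence


def wmape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Return the WMAPE in percent.

    Assumed definition (not confirmed by the source):
        100 * sum(|y_true - y_pred|) / sum(|y_true|)
    Each sample is thereby weighted by the magnitude of its true value.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total_abs_error = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    total_abs_true = sum(abs(t) for t in y_true)
    if total_abs_true == 0:
        raise ValueError("WMAPE is undefined when all true values are zero")
    return 100.0 * total_abs_error / total_abs_true


# Hypothetical elastic moduli (GPa) for three microstructures; illustrative only.
y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 190.0, 300.0]
score = wmape(y_true, y_pred)  # absolute errors sum to 20, true values sum to 600
```

Under this definition, samples with larger true values contribute more to the score, so a few near-zero labels cannot dominate the metric as they can with the unweighted mean absolute percentage error.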