Traditional deep learning models predict tool wear accurately under consistent working conditions, but actual production frequently involves varying conditions because different processing methods are used. Wear data from different working conditions differ substantially in distribution, so a model trained on milling cutter wear signals from one working condition can only predict wear under that same condition, which wastes considerable material and manpower in actual production. Therefore, a domain generalization method for tool wear prediction is introduced that combines Wide Deep Convolutional Neural Networks (WDCNN) with weighted adversarial multi-source domain generalization, and the MDG-WDCNNWAL model is constructed. After filtering and de-noising, the original data are input into the WDCNN for automatic feature learning, and the resulting multi-source feature signals are input into the prediction model and the discriminator, respectively, to obtain the corresponding loss values. Furthermore, the Wasserstein distance is used to measure the distance between the probability distributions of the multi-source feature signals; it serves as a weight on the discriminator loss, and the final loss value is obtained by adding the weighted discriminator loss to the prediction loss. Finally, the tool wear prediction model is built using a backpropagation neural network. Self-collected milling wear experimental data and the public NASA tool wear dataset were used to assess the predictive performance of the trained model. The model generalizes well in tool wear prediction, with average RMSE of 0.0599 and 0.2075 and average R² of 0.9037 and 0.9196 on the NASA and self-collected datasets, respectively. In comparative experiments with other data-driven methods, the model reduced average RMSE by more than 50% and increased average R² by more than 40%.
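
To make the weighting step concrete, the following is a minimal sketch of how a Wasserstein-weighted discriminator loss could be combined with the prediction loss. It is an illustration under stated assumptions, not the implementation used in this work: the function names `wasserstein_1d`, `domain_weights`, and `total_loss` are hypothetical, each source domain is reduced to a list of scalar feature values of equal length, and the weights are assumed to be normalized mean pairwise distances, a choice the abstract does not specify.

```python
from typing import List, Sequence


def wasserstein_1d(a: Sequence[float], b: Sequence[float]) -> float:
    """Empirical 1-D Wasserstein-1 distance between two equal-sized samples.

    For equal-sized samples this is the mean absolute difference
    between the sorted values.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def domain_weights(features: List[Sequence[float]]) -> List[float]:
    """Assumed weighting rule: each source domain is weighted by its mean
    Wasserstein distance to the other domains, normalized to sum to 1."""
    n = len(features)
    if n < 2:
        raise ValueError("at least two source domains are required")
    mean_dist = []
    for i in range(n):
        others = [wasserstein_1d(features[i], features[j])
                  for j in range(n) if j != i]
        mean_dist.append(sum(others) / len(others))
    total = sum(mean_dist)
    if total == 0.0:
        # All domains have identical distributions: fall back to uniform weights.
        return [1.0 / n] * n
    return [d / total for d in mean_dist]


def total_loss(pred_loss: float,
               disc_losses: Sequence[float],
               weights: Sequence[float]) -> float:
    """Final loss = prediction loss + weighted sum of per-domain
    discriminator losses (the additive combination is taken from the text;
    the per-domain form is an assumption)."""
    if len(disc_losses) != len(weights):
        raise ValueError("one weight per discriminator loss is required")
    return pred_loss + sum(w * l for w, l in zip(weights, disc_losses))


# Toy example with three hypothetical source domains.
features = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
weights = domain_weights(features)
loss = total_loss(0.5, [1.0, 1.0, 1.0], weights)
```

In this toy example the third domain lies farthest from the other two and therefore receives the largest weight. In a real training loop the distances would be computed on WDCNN feature batches and the loss terms would be differentiable tensors rather than plain floats.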