During the actual operation of high-speed trains, brake friction blocks are in a normal state most of the time. As a result, datasets collected on uneven wear states are usually highly imbalanced. To address this data imbalance, a monitoring model based on a multi-head self-attention mechanism and a one-dimensional multi-scale convolutional neural network (MHSA–1DMCNN) is proposed, which takes into account the correlation among multi-source data. First, multi-source friction interface parameters, such as vibration acceleration, braking noise, and friction coefficient, are collected as sample data to characterize the state of the brake friction blocks, and the SMOTE-Tomek method is applied for combined over- and under-sampling of the multi-source data. Then, a multi-head self-attention (MHSA) mechanism is used to extract important global information from multiple representation subspaces, and 1DMCNNs are applied to extract friction block state features at multiple scales. Finally, an MHSA module is constructed to fuse the multi-source features for monitoring the uneven wear state. Experimental results show that the proposed model maintains high recognition accuracy under different imbalance ratios. This provides a new, feasible method for monitoring the uneven wear state of high-speed train brake pads under imbalanced data.
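To make the resampling step concrete, the following is a minimal sketch of the SMOTE-Tomek idea on a two-class problem: synthetic minority samples are interpolated between minority neighbours (SMOTE), and pairs of mutually nearest samples from opposite classes (Tomek links) are then removed. It is an illustration under stated assumptions, not the implementation used in this work: the function names (`smote`, `remove_tomek_links`, `smote_tomek`), the neighbour count `k`, the binary-class setting, the choice to drop both members of each Tomek link, and the toy feature values are all assumptions made here. In practice, a library implementation such as `imblearn.combine.SMOTETomek` from imbalanced-learn would likely be preferable.

```python
import math
import random


def nearest(idx, points, candidates, k=1):
    """Return up to k indices from `candidates` closest to points[idx], excluding idx."""
    others = [j for j in candidates if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(X, y, minority, k=3, rng=None):
    """Oversample `minority` by interpolation until both classes have equal size.

    Assumes a binary problem with at least two minority samples.
    """
    rng = rng or random.Random(0)
    X, y = [list(p) for p in X], list(y)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_new = (len(y) - len(min_idx)) - len(min_idx)  # majority count minus minority count
    for _ in range(max(n_new, 0)):
        i = rng.choice(min_idx)                      # random original minority sample
        j = rng.choice(nearest(i, X, min_idx, k))    # one of its k minority neighbours
        gap = rng.random()                           # interpolation factor in [0, 1)
        X.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
        y.append(minority)
    return X, y


def remove_tomek_links(X, y):
    """Drop both samples of every Tomek link (mutual nearest neighbours of opposite class)."""
    all_idx = range(len(X))
    nn = [nearest(i, X, all_idx, 1)[0] for i in all_idx]
    drop = {i for i in all_idx if nn[nn[i]] == i and y[i] != y[nn[i]]}
    keep = [i for i in all_idx if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_tomek(X, y, minority, k=3, seed=0):
    """SMOTE oversampling followed by one pass of Tomek-link cleaning."""
    X_res, y_res = smote(X, y, minority, k, random.Random(seed))
    return remove_tomek_links(X_res, y_res)


# Toy two-feature samples (made-up values, not measured data):
# label 0 = normal state (majority), label 1 = uneven wear (minority).
X_toy = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.4, 0.1],
         [2.0, 2.0], [2.1, 2.2], [2.2, 2.0]]
y_toy = [0] * 6 + [1] * 3
X_bal, y_bal = smote_tomek(X_toy, y_toy, minority=1, k=2, seed=1)
```

Because each Tomek link contains exactly one sample from each class, removing both members keeps the classes balanced after the SMOTE step. For multi-source data, one plausible design (again an assumption) is to concatenate the per-source feature vectors of each sample before resampling, so that the synthetic samples stay consistent across sources.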