Groundwater contamination induced by anthropogenic activities has long been a global issue. Characterizing and modeling contaminant transport processes is crucial to groundwater protection and management. However, challenges remain in process complexity, data constraints, and computational cost. In the era of big data, the growth of machine learning has opened new opportunities for studying contaminant transport in groundwater systems. In this work, we introduce a new attention-based graph neural network (aGNN) for modeling contaminant transport with limited monitoring data and for quantifying causal connections between contaminant sources (drivers) and their spreading (outcomes). In five synthetic case studies that involve varying monitoring networks in heterogeneous aquifers, aGNN outperforms LSTM-based (long short-term memory) and CNN-based (convolutional neural network) methods in multistep predictions (i.e., transductive learning). It also demonstrates a high level of applicability in inferring observations for unmonitored sites (i.e., inductive learning). Furthermore, an explanatory analysis based on aGNN quantifies the influence of each contaminant source; the results are consistent with those of a physics-based model, with an R² value exceeding 92%. The major advantage of aGNN is that it not only has a high level of predictive power in multiple scenario evaluations but also substantially reduces computational cost. Overall, this study shows that aGNN is efficient and robust for highly nonlinear spatiotemporal learning in subsurface contaminant transport, and that it provides a promising tool for groundwater management involving contaminant source attribution.