Objective: 
Accurate dose calculation is an essential prerequisite for precise radiotherapy. Integrating deep learning into dosimetry could balance computational accuracy and efficiency, and is potentially applicable to clinical dose calculation. The generalisation of a deep learning dose calculation method (hereinafter the TERMA-Monte Carlo network, T-MC net) was evaluated in clinical practice using intensity-modulated radiotherapy (IMRT) plans for various body regions from multiple institutions, with the Monte Carlo (MC) algorithm serving as the benchmark.
Approach: 
Sixty IMRT plans from four institutions were selected to test the head and neck, chest and abdomen, and pelvis regions. With the MC results as the benchmark, the T-MC net results were compared in terms of three-dimensional dose distributions and dose-volume histograms (DVHs) for the entire body, the planning target volume (PTV) and the organs at risk (OARs). The mean ± 95% confidence interval was calculated for the gamma pass rate (GPR), the percentage of agreement (PA) and the dose difference ratio (DDR) of the dose indices D95, D50 and D5.
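To make the DVH-based metrics concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that Dx is the minimum dose received by the hottest x% of a structure's voxels, that the DDR is the signed relative difference between the test and reference dose index in percent, and that the 95% confidence interval uses a normal approximation (1.96 standard errors). The abstract does not define the DDR or the interval method, so these definitions and the function names `dose_index`, `dose_difference_ratio` and `mean_ci95` are hypothetical.

```python
import numpy as np


def dose_index(dose, mask, x):
    """Return D_x: the minimum dose received by the hottest x% of the structure.

    Assumes equal voxel volumes, so D_x is the (100 - x)th percentile of the
    voxel doses inside the structure mask.
    """
    voxels = np.asarray(dose, dtype=float)[np.asarray(mask, dtype=bool)]
    return float(np.percentile(voxels, 100.0 - x))


def dose_difference_ratio(d_test, d_ref):
    """Signed relative difference in percent (assumed definition of the DDR)."""
    return 100.0 * (d_test - d_ref) / d_ref


def mean_ci95(values):
    """Mean and half-width of a normal-approximation 95% confidence interval."""
    v = np.asarray(values, dtype=float)
    half_width = 1.96 * v.std(ddof=1) / np.sqrt(v.size)
    return float(v.mean()), float(half_width)


if __name__ == "__main__":
    # Synthetic stand-ins for an MC reference dose and a network prediction.
    rng = np.random.default_rng(0)
    mask = np.ones((8, 8, 8), dtype=bool)
    ddrs = []
    for _ in range(5):  # five hypothetical plans
        mc = rng.uniform(50.0, 70.0, size=mask.shape)
        net = mc * (1.0 + rng.normal(0.0, 0.005, size=mask.shape))
        ddrs.append(dose_difference_ratio(dose_index(net, mask, 95),
                                          dose_index(mc, mask, 95)))
    mean, half = mean_ci95(ddrs)
    print(f"DDR(D95) = {mean:.2f} ± {half:.2f} %")
```

The same three functions apply unchanged to D50 and D5 and to any PTV or OAR mask; a clinical implementation would also need to account for partial-volume voxels at structure boundaries.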
Main results: 
For the entire body, the GPRs at 3%/3 mm, 2%/2 mm and 2%/1 mm and the PA were 99.62 ± 0.32%, 98.50 ± 1.09%, 95.60 ± 2.90% and 97.80 ± 1.12%, respectively. For the PTV, the GPRs at 3%/3 mm, 2%/2 mm and 2%/1 mm and the PA were 98.90 ± 1.00%, 95.78 ± 2.83%, 92.23 ± 4.74% and 98.93 ± 0.62%, respectively. The absolute value of the average DDR was less than 1.4%.
Significance:
We proposed a general dose calculation framework based on deep learning and, using the MC algorithm as the benchmark, performed a generalisation test on IMRT treatment plans from multiple institutions. The framework provides high computational speed while maintaining the accuracy of MC, and may become an effective dose calculation engine for treatment planning, adaptive radiotherapy and dose verification.