Background
Positron emission tomography (PET) has been investigated for its ability to reconstruct proton‐induced positron activity distributions in proton therapy. This technique holds potential for range verification in clinical practice. Recently, deep learning‐based dose estimation from positron activity distributions has shown promise for in vivo proton dose monitoring and guided proton therapy.

Purpose
This study evaluates the effectiveness of three classical neural network models for proton dose estimation: a recurrent neural network (RNN), a U‐Net, and a Transformer. It also investigates the characteristics of these models to inform the choice of model in clinical practice.

Methods
Proton dose distributions for spot beams were simulated using Geant4. Computed tomography (CT) images from four head cases were used, three for training the neural networks and one for testing. The networks took one‐dimensional (1D) positron activity distributions as inputs and produced 1D dose distributions as outputs. The impact of the number of training samples on network performance was examined, and dose prediction performance was evaluated in both a homogeneous brain site and a heterogeneous nasopharyngeal site. The effect of uncertainty in the positron activity distribution on dose prediction performance was also investigated. Mean relative error (MRE) and absolute range error (ARE) were used as quantitative evaluation metrics.

Results
The U‐Net showed a notable advantage in range verification when training samples were scarce, with approximately 75% of AREs below 0.5 mm using only 500 training samples. The networks performed better in the homogeneous brain site than in the heterogeneous nasopharyngeal site. In the homogeneous brain site, all networks exhibited small AREs, with approximately 90% of AREs below 0.5 mm, and the Transformer gave the best overall dose distribution prediction, with approximately 92% of MREs below 3%. In the heterogeneous nasopharyngeal site, all networks demonstrated acceptable AREs, with approximately 88% of AREs below 3 mm, and the Transformer again gave the best overall dose distribution prediction, with approximately 85% of MREs below 5%. The dose prediction performance of all three networks declined as the uncertainty of the positron activity distribution increased, and the Transformer consistently outperformed the other networks in all cases.

Conclusions
The U‐Net and the Transformer each offer distinct advantages for proton dose estimation. The U‐Net is well suited to range verification when only a small training set is available, while the Transformer performs best for dose‐guided proton therapy.
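
The abstract names MRE and ARE as the evaluation metrics but does not define them. The Python sketch below shows one way such metrics could be computed for a pair of 1D depth-dose profiles. It rests on two assumptions that are not taken from the study: MRE is the mean absolute dose difference normalized by the maximum reference dose, and ARE is the absolute difference in distal range at 80% of the maximum dose. The function names and the 80% level are hypothetical choices for illustration.

```python
import numpy as np


def mean_relative_error(pred, true):
    """Mean absolute dose difference, normalized by the maximum reference dose.

    Assumed definition; the study may normalize differently.
    """
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.mean(np.abs(pred - true)) / np.max(true))


def distal_range(dose, depth, level=0.8):
    """Depth on the distal falloff where the dose drops to level * max.

    Uses linear interpolation between the two samples that bracket the
    target dose. The 80% level is an assumed convention.
    """
    dose = np.asarray(dose, dtype=float)
    depth = np.asarray(depth, dtype=float)
    target = level * dose.max()
    peak = int(np.argmax(dose))
    # Offsets (relative to the peak) of samples that fall below the target.
    below = np.nonzero(dose[peak:] < target)[0]
    if below.size == 0:
        raise ValueError("dose never falls below the target level beyond the peak")
    i = peak + int(below[0])  # dose[i] < target <= dose[i - 1]
    d0, d1 = dose[i - 1], dose[i]
    frac = (d0 - target) / (d0 - d1)
    return float(depth[i - 1] + frac * (depth[i] - depth[i - 1]))


def absolute_range_error(pred, true, depth, level=0.8):
    """Absolute difference between predicted and reference distal ranges."""
    return abs(distal_range(pred, depth, level) - distal_range(true, depth, level))


if __name__ == "__main__":
    # Synthetic profiles for illustration only (not data from the study):
    # a Gaussian-shaped peak and a copy shifted 2 mm deeper.
    depth_mm = np.arange(0.0, 100.0, 1.0)
    true_dose = np.exp(-0.5 * ((depth_mm - 60.0) / 5.0) ** 2)
    pred_dose = np.exp(-0.5 * ((depth_mm - 62.0) / 5.0) ** 2)
    print("MRE:", mean_relative_error(pred_dose, true_dose))
    print("ARE (mm):", absolute_range_error(pred_dose, true_dose, depth_mm))
```

Under these assumed definitions, the percentages reported in the Results would correspond to the fraction of test beams whose metric falls below a given threshold, for example `np.mean(are_values < 0.5)` over an array of per-beam AREs.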