The purpose of this study was to investigate the comparability of three output prediction models for a compact double‐scattered proton therapy system. Two published output prediction models were commissioned for our Mevion S250 proton therapy system. Model A is a correction‐based model (Sahoo et al., Med Phys, 2008;35(11):5088–5097), and model B is an analytical model that employs a function of r = (R’ − M’)/M’ (Kooy et al., Phys Med Biol, 2005;50:5847–5856), where R’ is the depth of the distal 100% dose with straggling and M’ is the width between the distal 100% dose and the proximal 100% dose with straggling; these definitions were used instead of the theoretical ones because they gave more accurate output prediction. With the vendor definitions of R (distal 90% dose) and M (distal 90% dose to proximal 95% dose), R’ = R − 0.31 (g cm⁻²) and M’ = 0.81 × M (g cm⁻²), so that r = ((R − 0.31) − 0.81 × M)/(0.81 × M). In addition, a quartic polynomial fit model (model C), mathematically converted from model B, was studied. The outputs of 272 sets of R and M covering the 24 double‐scattering options were measured, and each model's predicted output was compared with the measured output. For the total dataset, the percent differences between predicted and measured outputs, (predicted − measured)/measured × 100%, were within ±3% for all three models. The average differences (± standard deviation) were −0.13 ± 0.94%, −0.13 ± 1.20%, and −0.22 ± 1.11% for models A, B, and C, respectively. The p‐values of the t‐test were 0.912 (model A vs. B), 0.061 (model A vs. C), and 0.136 (model B vs. C). For all options, all three models gave clinically acceptable predictions. The differences among models A, B, and C were not statistically significant; however, model A generally has the potential to predict the output more accurately if a larger commissioning dataset is used. It is concluded that the three models can be used comparably for the compact proton therapy system.
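
As an illustration of the conversion described above, the following minimal Python sketch computes r from the vendor‐defined R and M using the stated offsets (R’ = R − 0.31 g cm⁻², M’ = 0.81 × M), together with the percent difference between predicted and measured outputs. The function and parameter names are hypothetical, chosen for this example only; the sketch does not reproduce the fitted output functions of models A, B, or C, which are not given here.

```python
def r_from_vendor(R, M, range_shift=0.31, mod_scale=0.81):
    """Return r = (R' - M') / M' from vendor-defined range R and modulation M.

    R and M are in g/cm^2. The defaults follow the conversion stated in the
    text: R' = R - 0.31 and M' = 0.81 * M. Parameter names are illustrative.
    """
    if M <= 0:
        raise ValueError("modulation width M must be positive")
    R_prime = R - range_shift   # distal 100% dose depth with straggling
    M_prime = mod_scale * M     # distal 100% to proximal 100% dose width
    return (R_prime - M_prime) / M_prime


def percent_difference(predicted, measured):
    """Return (predicted - measured) / measured * 100, in percent."""
    return (predicted - measured) / measured * 100.0


if __name__ == "__main__":
    # Hypothetical field: R = 15.0 g/cm^2, M = 10.0 g/cm^2 (not from the study).
    r = r_from_vendor(15.0, 10.0)
    print(f"r = {r:.4f}")
    # Hypothetical predicted and measured outputs, for illustration only.
    print(f"difference = {percent_difference(1.02, 1.00):.2f}%")
```

The numeric inputs in the usage section are made up for demonstration and are not taken from the 272 measured sets.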