To account for intra‐fractional tumor motion during dose delivery in radiotherapy, various treatment strategies are clinically implemented, such as breathing‐adapted gating, in which the tumor is irradiated during specific breathing phases. In this work, we present a comprehensive phantom‐based end‐to‐end test of breathing‐adapted gating utilizing surface guidance for use in particle therapy. A commercial dynamic thorax phantom was used to reproduce regular and irregular breathing patterns, which were recorded by the GateRT respiratory monitoring system. The amplitudes and periods of the recorded breathing patterns were analysed and compared to the planned patterns (ground‐truth). In addition, the mean absolute deviations (MAD) and Pearson correlation coefficients (PCC) between the measurements and the ground‐truth were assessed. Measurements of gated and non‐gated irradiations were also analysed with respect to dosimetry and geometry, and compared to the treatment planning system (TPS). Further, the latency time of beam on/off was evaluated. Compared to the ground‐truth, measurements performed with GateRT showed amplitude differences between 0.03 ± 0.02 mm and 0.26 ± 0.03 mm for regular and irregular breathing patterns, whilst the standard deviations of the periods of both breathing patterns ranged between 10 and 190 ms. Furthermore, the GateRT software precisely acquired breathing patterns, with a maximum MAD of 0.30 ± 0.23 mm. The PCC consistently ranged between 0.998 and 1.000. Comparisons between TPS and measured dose profiles indicated absolute mean dose deviations within the institutional tolerance of ±5%. Geometrical beam characteristics also varied within our institutional tolerance of 1.5 mm. The overall time delays were <60 ms and thus within the tolerances recommended by both ESTRO (200 ms) and AAPM (100 ms).
In this study, a non‐invasive optical surface‐guided workflow including image acquisition, treatment planning, patient positioning and gated irradiation at an ion‐beam gantry was investigated and shown to be clinically viable. Based on phantom measurements, our results show clinically appropriate spatial, temporal, and dosimetric accuracy when using surface guidance in the clinical setting, and they comply with international and institutional guidelines and tolerances.
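The two agreement metrics reported above, MAD and PCC, can be illustrated with a minimal sketch. It assumes that MAD denotes the mean of the pointwise absolute differences between time‐aligned samples of the measured and planned traces; the exact definition and any resampling or alignment used in the study are not stated here. The function names, the sinusoidal trace, the sampling rate, and the 0.1 mm offset are all hypothetical illustration values, not study data.

```python
import math


def mean_absolute_deviation(measured, reference):
    """Mean of |measured - reference| over time-aligned samples (assumed MAD definition)."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("traces must be non-empty and of equal length")
    return sum(abs(m - r) for m, r in zip(measured, reference)) / len(measured)


def pearson_correlation(measured, reference):
    """Pearson correlation coefficient between two equally long traces."""
    if len(measured) != len(reference) or len(measured) < 2:
        raise ValueError("traces must have equal length of at least 2")
    n = len(measured)
    mean_m = sum(measured) / n
    mean_r = sum(reference) / n
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(measured, reference))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    return cov / math.sqrt(var_m * var_r)


# Hypothetical planned trace: 5 mm amplitude, 4 s period, sampled at 25 Hz for 8 s.
period_s = 4.0
sample_rate_hz = 25.0
times = [i / sample_rate_hz for i in range(200)]
reference = [5.0 * math.sin(2.0 * math.pi * t / period_s) for t in times]

# Hypothetical "measured" trace: the planned trace shifted by a constant 0.1 mm.
measured = [x + 0.1 for x in reference]

mad = mean_absolute_deviation(measured, reference)
pcc = pearson_correlation(measured, reference)
print(f"MAD = {mad:.3f} mm, PCC = {pcc:.3f}")
```

In this toy case a constant offset raises the MAD but leaves the PCC at unity, which is why the two metrics are complementary: the PCC captures agreement in shape, whereas the MAD captures agreement in absolute position.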