Purpose
To evaluate accuracy for 2 deformable image registration methods (in-house B-spline and MIM freeform) using image pairs exhibiting changes in patient orientation and lung volume and to assess the appropriateness of registration accuracy tolerances proposed by the American Association of Physicists in Medicine Task Group 132 under such challenging conditions via assessment by expert observers.

Methods and Materials
Four-dimensional computed tomography scans for 12 patients with lung cancer were acquired with patients in prone and supine positions. Tumor and organs at risk were delineated by a physician on all data sets: supine inhale (SI), supine exhale, prone inhale, and prone exhale. The SI image was registered to the other images using both registration methods. All SI contours were propagated using the resulting transformations and compared with physician delineations using Dice similarity coefficient, mean distance to agreement, and Hausdorff distance. Additionally, propagated contours were anonymized along with ground-truth contours and rated for quality by physician-observers.

Results
Averaged across all patients, the accuracy metrics investigated remained within tolerances recommended by Task Group 132 (Dice similarity coefficient >0.8, mean distance to agreement <3 mm). MIM performed better with both complex (vertebrae) and low-contrast (esophagus) structures, whereas the in-house method performed better with lungs (whole and individual lobes). Accuracy metrics worsened but remained within tolerances when propagating from supine to prone; however, the Jacobian determinant contained regions with negative values, indicating localized nonphysiologic deformations.
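To illustrate the Jacobian check mentioned above, the following is a minimal sketch rather than the method used in the study. It assumes the transformation is written as x + u(x), with the displacement field u sampled on a regular grid and stored as nested lists of (ux, uy, uz) tuples in millimeters. All function names (`make_field`, `jacobian_determinants`, `folded_fraction`) are hypothetical. A determinant at or below zero marks a voxel where the mapping folds.

```python
def make_field(shape, fn):
    """Build a nested-list displacement field; fn(i, j, k) -> (ux, uy, uz) in mm."""
    return [[[fn(i, j, k) for k in range(shape[2])]
             for j in range(shape[1])]
            for i in range(shape[0])]


def _diff(field, idx, axis, comp, shape, spacing):
    """Finite difference of displacement component `comp` along `axis`.

    Central difference in the interior, one-sided at the grid edges.
    """
    lo, hi = list(idx), list(idx)
    lo[axis] = max(idx[axis] - 1, 0)
    hi[axis] = min(idx[axis] + 1, shape[axis] - 1)
    step = (hi[axis] - lo[axis]) * spacing[axis]
    return (field[hi[0]][hi[1]][hi[2]][comp]
            - field[lo[0]][lo[1]][lo[2]][comp]) / step


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def jacobian_determinants(field, spacing=(1.0, 1.0, 1.0)):
    """Return {voxel index: det(I + grad u)}; needs at least 2 voxels per axis."""
    shape = (len(field), len(field[0]), len(field[0][0]))
    dets = {}
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                # Row c, column a holds d(phi_c)/d(x_a) = delta_ca + d(u_c)/d(x_a).
                jac = [[(1.0 if c == a else 0.0)
                        + _diff(field, (i, j, k), a, c, shape, spacing)
                        for a in range(3)]
                       for c in range(3)]
                dets[(i, j, k)] = _det3(jac)
    return dets


def folded_fraction(dets):
    """Fraction of voxels whose Jacobian determinant is not positive."""
    return sum(1 for d in dets.values() if d <= 0.0) / len(dets)


# Toy fields (illustrative only): no motion, and a mirror flip along the first axis.
identity = make_field((3, 3, 3), lambda i, j, k: (0.0, 0.0, 0.0))
mirrored = make_field((3, 3, 3), lambda i, j, k: (-2.0 * i, 0.0, 0.0))
print(folded_fraction(jacobian_determinants(identity)))
print(folded_fraction(jacobian_determinants(mirrored)))
```

In this toy example the identity field has a determinant of 1 at every voxel, whereas the mirrored field has a determinant of -1 everywhere and is therefore flagged as fully folded. A clinical displacement field would typically show folding only in localized regions.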
For MIM and in-house registrations, 50% and 43.8%, respectively, of propagated contours were rated acceptable as is, and 8.2% and 11.0%, respectively, were rated clinically unacceptable.

Conclusions
The deformable image registration methods performed reliably and met recommended tolerances despite anatomically challenging cases exceeding typical interfraction variability. However, additional quality assurance measures are necessary for complex applications (eg, dose propagation). Human review rather than unsupervised implementation should always be part of the clinical registration workflow.
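For reference, the contour-agreement metrics named in the Methods can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the study: structures are represented as sets of voxel indices for the Dice similarity coefficient and as lists of surface points in millimeters for the distance metrics, and mean distance to agreement is taken as the symmetric average of nearest-neighbor distances (definitions vary between software packages). All function names are hypothetical; the tolerance values are those quoted in the Results.

```python
import math


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two sets of voxel indices."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def _directed_distances(points_a, points_b):
    """For each point in A, the distance (mm) to its nearest point in B."""
    return [min(math.dist(p, q) for q in points_b) for p in points_a]


def mean_distance_to_agreement(points_a, points_b):
    """Symmetric mean of nearest-neighbor surface distances (assumed definition)."""
    d_ab = _directed_distances(points_a, points_b)
    d_ba = _directed_distances(points_b, points_a)
    return (sum(d_ab) + sum(d_ba)) / (len(d_ab) + len(d_ba))


def hausdorff_distance(points_a, points_b):
    """Largest nearest-neighbor distance in either direction."""
    return max(max(_directed_distances(points_a, points_b)),
               max(_directed_distances(points_b, points_a)))


def within_tg132(dsc, mda_mm):
    """Check the tolerances quoted in the abstract: DSC > 0.8 and MDA < 3 mm."""
    return dsc > 0.8 and mda_mm < 3.0


# Toy example (illustrative only): a 4 x 4 voxel square and a copy shifted by
# one voxel, with 1 mm isotropic spacing assumed so indices double as points.
reference = [(x, y, 0) for x in range(4) for y in range(4)]
propagated = [(x + 1, y, 0) for x in range(4) for y in range(4)]

dsc = dice(reference, propagated)
mda = mean_distance_to_agreement(reference, propagated)
hd = hausdorff_distance(reference, propagated)
print(dsc, mda, hd, within_tg132(dsc, mda))
```

The brute-force nearest-neighbor search is quadratic in the number of points, which is adequate for a sketch; clinical contours would normally call for a distance transform or a spatial index.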