Quantitative validation of deformable image registration (DIR) algorithms is extremely difficult because of the complexity involved in constructing a deformable phantom that can duplicate various clinical scenarios. The purpose of this study is to describe a framework for testing the accuracy of DIR that is based on computational modeling and on evaluation using inverse consistency and other methods. Three clinically relevant organ deformations were created in prostate (distended rectum and rectal gas), head and neck (large neck flexion), and lung (inhale and exhale lung volumes with variable contrast enhancement) study sets. DIR was performed using both B‐spline and diffeomorphic demons algorithms in the forward and inverse directions. The forward and inverse deformation vector fields were composed to quantify the inverse consistency error (ICE). The anatomical correspondence of tumor and organs at risk was quantified by comparing the original RT structures with those obtained after DIR. Further, the physical characteristics of the deformation field, namely the Jacobian and harmonic energy, were computed to quantify the preservation of image topology and the regularity of the spatial transformation obtained in DIR. In the prostate case the ICE was comparable for the two algorithms, but the B‐spline algorithm had significantly better anatomical correspondence for rectum and prostate than the diffeomorphic demons algorithm. In the head and neck case the ICE was 6.5 mm for the demons algorithm, compared with 0.7 mm for B‐spline. Since the induced neck flexion was large, the average Dice similarity coefficient between the two algorithms was only 0.87, 0.52, 0.81, and 0.67 for tumor, cord, parotids, and mandible, respectively. In the lung study, the B‐spline algorithm accurately estimated deformations between images with variable contrast, whereas the diffeomorphic demons algorithm led to gross errors on structures affected by contrast variation.
The proposed framework allows known deformations to be applied to any image dataset in order to evaluate the overall accuracy and limitations of a DIR algorithm used in radiation oncology. Evaluation based on anatomical correspondence, physical characteristics of the deformation field, and image characteristics can facilitate DIR verification, with the ultimate goal of implementing adaptive radiotherapy. The suitability of a particular evaluation metric for validating DIR depends on the clinical deformation observed.

PACS numbers: 87.57 nj, 87.55‐x, 87.55 Qr
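
For readers who want a concrete picture of the evaluation metrics named above, the following is a minimal sketch and not the implementation used in this study. It assumes displacement fields stored as NumPy arrays of shape (ndim, *grid) in voxel units, voxel spacing given in mm, and binary structure masks; all function names are hypothetical. Harmonic energy is taken here as the mean squared Frobenius norm of the displacement gradient, which is one common definition and may differ from the one used in the study.

```python
import numpy as np
from scipy.ndimage import map_coordinates


def inverse_consistency_error(fwd, inv, spacing):
    """Per-voxel ICE magnitude in mm.

    fwd, inv: displacement fields of shape (ndim, *grid), in voxels.
    Composes x -> x + fwd(x) -> x + fwd(x) + inv(x + fwd(x)); the residual
    is zero when the pair is perfectly inverse consistent.
    """
    ndim = fwd.shape[0]
    axes = [np.arange(n) for n in fwd.shape[1:]]
    grid = np.stack(np.meshgrid(*axes, indexing="ij")).astype(float)
    mapped = grid + fwd
    # Linearly interpolate the inverse field at the forward-mapped positions.
    inv_at_mapped = np.stack(
        [map_coordinates(inv[d], mapped, order=1, mode="nearest")
         for d in range(ndim)]
    )
    residual = fwd + inv_at_mapped
    sp = np.asarray(spacing, dtype=float).reshape((ndim,) + (1,) * ndim)
    return np.sqrt(((residual * sp) ** 2).sum(axis=0))


def displacement_gradient(disp, spacing):
    """Gradient of the displacement, d(u_i)/d(x_j), in physical units."""
    ndim = disp.shape[0]
    grad = np.zeros(disp.shape[1:] + (ndim, ndim))
    for i in range(ndim):
        for j in range(ndim):
            grad[..., i, j] = np.gradient(
                disp[i] * spacing[i], spacing[j], axis=j
            )
    return grad


def jacobian_determinant(disp, spacing):
    """det(I + grad u); values <= 0 indicate folding (topology not preserved)."""
    grad = displacement_gradient(disp, spacing)
    return np.linalg.det(grad + np.eye(disp.shape[0]))


def harmonic_energy(disp, spacing):
    """Mean squared Frobenius norm of grad u (assumed definition)."""
    grad = displacement_gradient(disp, spacing)
    return float((grad ** 2).sum(axis=(-1, -2)).mean())


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / total
```

In this sketch a uniform translation paired with its exact opposite gives zero ICE everywhere, a Jacobian determinant of 1, and zero harmonic energy, which makes it a convenient sanity check before applying the functions to real deformation vector fields.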