Purpose: Patients with early-stage lung cancer undergoing stereotactic ablative radiotherapy receive four-dimensional computed tomography (4D-CT) for treatment planning. Often, an internal gross target volume (iGTV), which approximates the motion envelope of a tumor over the breathing cycle, is delineated without defining a gross tumor volume (GTV). However, the GTV volume and shape are important parameters for prognostic and dose modeling, and there is interest in radiomic features extracted from the GTV and surrounding tissue. We demonstrate and validate a method to generate the GTV from an iGTV contour to aid retrospective analysis of routine data. Method: The geometry of a tumor can be reconstructed from knowledge of the tumor motion and the motion envelope formed during respiration. To demonstrate this, the tumor motion path was estimated with local rigid registration, and the iGTV was positioned incrementally at stations along the reverse path. We show that the tumor volume is the largest set common to the iGTV at all of these positions, that is, their intersection, and can therefore be derived. This was implemented for 521 lung lesions on 4D-CT. Eleven patients with a GTV delineated by a radiation oncologist on a reference phase (50%) were used for validation. The generated GTV was compared with the expert delineation using distance-to-agreement (DTA), volume, and distance between centres of mass. An overall success rate was determined by detecting registration inaccuracy and performing a quality check on the routine iGTV. For successfully generated contours, GTV volume was compared with iGTV volume in a prognostic model for overall survival. Results: For the validation dataset, DTA mean (0.79-1.55 mm) and standard deviation (0.68-1.51 mm) were comparable to expected observer variation. The difference in volume was < 5 cm³, and the average difference in position was 1.21 mm.
Deviations in shape and position were mainly caused by observer differences in iGTV and GTV interpretation rather than by algorithm performance. For the complete dataset, an acceptable contour was generated for 94% of patients, using statistical and visual assessment to detect failures. Generated GTV volumes improved prognostic model performance over iGTV volumes. Conclusion: A method to generate a GTV from an iGTV and a 4D-CT dataset was developed. This method facilitates analysis of data from patients with early-stage lung cancer treated in the routine setting, that is, data mining, prognostic modeling, and radiomics. Generation failure detection removes the need for visual assessment of all contours, reducing a time-consuming aspect of big-data analysis. The favorable prognostic performance of generated GTV volumes relative to iGTV volumes demonstrates opportunities to use this methodology in future studies.
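
The intersection step described under Method can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the tumor moves as a rigid body by whole-voxel translations, that the displacement of each breathing phase relative to the reference phase is already known (in the study these come from local rigid registration), and that no shift exceeds the grid size. The names `shift_mask` and `gtv_from_igtv` are hypothetical.

```python
import numpy as np

def shift_mask(mask, offset):
    """Translate a boolean mask by an integer voxel offset, padding with False.

    Assumes every |offset| component is smaller than the grid size on that axis.
    """
    out = np.zeros_like(mask)
    src, dst = [], []
    for size, d in zip(mask.shape, offset):
        if d >= 0:
            src.append(slice(0, size - d))
            dst.append(slice(d, size))
        else:
            src.append(slice(-d, size))
            dst.append(slice(0, size + d))
    out[tuple(dst)] = mask[tuple(src)]
    return out

def gtv_from_igtv(igtv, displacements):
    """Intersect the iGTV placed at each station along the reverse motion path.

    `displacements` holds the tumor offset of each phase relative to the
    reference phase (hypothetical input; the zero offset is the reference).
    """
    igtv = igtv.astype(bool)
    gtv = igtv.copy()
    for d in displacements:
        reverse = tuple(-x for x in d)
        gtv &= shift_mask(igtv, reverse)
    return gtv

# Synthetic check: a 5x5x5 cube moving 0..3 voxels along the first axis.
tumor = np.zeros((30, 30, 30), dtype=bool)
tumor[10:15, 10:15, 10:15] = True
displacements = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]

# The iGTV is the union of the tumor over all phases (the motion envelope).
igtv = np.zeros_like(tumor)
for d in displacements:
    igtv |= shift_mask(tumor, d)

recovered = gtv_from_igtv(igtv, displacements)
```

In this idealized case the intersection returns the original cube. With clinical data, registration error and observer variation in the iGTV contour mean the result is an approximation, which is why the abstract reports failure detection and validation against expert delineations.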