Background
This cohort study examined the impact of the lengths of lookback and confirmation periods as well as the definition of confirmatory events on the number of incident cancer cases identified and age-standardized cumulative incidences (ACI) estimated in administrative data using German cancer registry data as a benchmark.

Methods
ACI per 100,000 insured persons for breast, prostate and colorectal cancer were estimated using BARMER Statutory Health Insurance claims data. Incident cancer cases were defined as having an in- or outpatient diagnosis in 2013, no diagnosis in a lookback period of 1 year and a second diagnosis (or death) in a confirmation period of 1 quarter. We varied lookback periods from 1 to 7 years, confirmation periods from 1 to 4 quarters as well as the definition of confirmatory events and compared ACI estimates to cancer registry data.

Results
ACI were higher for breast (138.7) and prostate (103.6) but lower for colorectal cancer (42.1) when compared to cancer registries (119.3, 98.0 and 45.5, respectively). Extending the lookback period to 7 years reduced ACI to 129.0, 95.1 and 38.3. An extended confirmation period of 4 quarters increased ACI to 151.3, 114.9 and 46.8. Including breast and colorectal surgeries as a confirmatory event reduced ACI to 114.9 and 37.1, respectively.

Conclusions
The choice of lookback and confirmation periods and the definition of confirmatory events have considerable impact on the number of incident cancer cases identified and ACI estimated. Researchers need to be aware of potential misclassification when identifying incident cancer cases in administrative data. Further validation studies as well as studies using administrative data to estimate cancer incidences should consider several choices of the lookback and confirmation periods and the definition of confirmatory events to show how these parameters impact the validity and robustness of their results.

Electronic supplementary material
The online version of this article (doi:10.1186/s12874-017-0407-4) contains supplementary material, which is available to authorized users.
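
The incident case definition given in the Methods (a diagnosis in 2013, no diagnosis during the lookback period, and a second diagnosis or death within the confirmation period) can be illustrated with a minimal sketch. This is not the study's implementation. The names `Person`, `quarter_index`, `is_incident_case` and `count_incident_cases` are hypothetical, time is simplified to calendar quarters, the confirmation window is assumed to cover the quarters following the index quarter for diagnoses (and to include the index quarter for death), and the four-person cohort is invented purely to show how the two parameters move the case count.

```python
from dataclasses import dataclass
from typing import List, Optional


def quarter_index(year: int, quarter: int) -> int:
    """Map (year, quarter 1..4) to a running quarter number (hypothetical helper)."""
    return year * 4 + (quarter - 1)


@dataclass
class Person:
    # Quarters in which a cancer diagnosis was coded (illustrative data model)
    diagnosis_quarters: List[int]
    # Quarter of death, if any
    death_quarter: Optional[int] = None


def is_incident_case(person: Person, index_year: int = 2013,
                     lookback_years: int = 1,
                     confirmation_quarters: int = 1) -> bool:
    year_start = quarter_index(index_year, 1)
    year_end = quarter_index(index_year, 4)
    in_year = sorted(q for q in person.diagnosis_quarters
                     if year_start <= q <= year_end)
    if not in_year:
        return False  # no diagnosis in the index year
    index_q = in_year[0]  # first diagnosis in the index year

    # Lookback: any diagnosis in the preceding N years marks a prevalent case
    lookback_start = index_q - 4 * lookback_years
    if any(lookback_start <= q < index_q for q in person.diagnosis_quarters):
        return False

    # Confirmation: second diagnosis in a later quarter within the window
    # (assumption), or death in the index quarter or within the window
    window_end = index_q + confirmation_quarters
    confirmed = any(index_q < q <= window_end
                    for q in person.diagnosis_quarters)
    died = (person.death_quarter is not None
            and index_q <= person.death_quarter <= window_end)
    return confirmed or died


def count_incident_cases(cohort: List[Person], **params) -> int:
    return sum(is_incident_case(p, **params) for p in cohort)


# Invented example cohort, for illustration only
cohort = [
    # Diagnosis in 2013 Q2, confirmed in Q3
    Person([quarter_index(2013, 2), quarter_index(2013, 3)]),
    # Earlier diagnosis in 2011 Q1, missed by a 1-year lookback
    Person([quarter_index(2011, 1), quarter_index(2013, 2),
            quarter_index(2013, 3)]),
    # Second diagnosis only three quarters after the first
    Person([quarter_index(2013, 2), quarter_index(2014, 1)]),
    # Single diagnosis in 2013 Q4, death in the following quarter
    Person([quarter_index(2013, 4)], death_quarter=quarter_index(2014, 1)),
]

for lookback, confirmation in [(1, 1), (3, 1), (1, 4)]:
    n = count_incident_cases(cohort, lookback_years=lookback,
                             confirmation_quarters=confirmation)
    print(f"lookback={lookback}y confirmation={confirmation}q cases={n}")
```

In this toy cohort a longer lookback removes the person with the 2011 diagnosis, and a longer confirmation period adds the person whose second diagnosis arrives late, which matches the direction of the effects reported in the Results.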