Background
Compared to very low gestational age (<32 weeks; VLGA) cohorts, very low birth weight (<1500 g; VLBW) cohorts are more prone to selection bias toward small-for-gestational-age (SGA) infants, which may affect the validity of data for benchmarking purposes.

Method
Data from all VLGA or VLBW infants admitted in the three networks between 2008 and 2011 were used. Two-thirds of each network cohort was randomly selected to develop prediction models for mortality and composite adverse outcome (CAO: mortality or cerebral injuries, chronic lung disease, severe retinopathy or necrotizing enterocolitis), and the remaining third was used for internal validation. Areas under the ROC curves (AUC) of the models were compared.

Results
The VLBW cohort (24,335 infants) had about twice the proportion of SGA infants (20.4% vs. 9.3%) of the VLGA cohort (29,180 infants) and a higher rate of CAO (36.5% vs. 32.6%). The two models had equal predictive power for mortality and CAO (AUC 0.83) and performed similarly in all other cross-cohort validations (AUC 0.81–0.85). Neither model performed well at the extremes of birth weight for gestation (<1500 g and ≥32 weeks, AUC 0.50–0.65; ≥1500 g and <32 weeks, AUC 0.60–0.62).

Conclusion
There was no difference in predictive power for adverse outcome between cohorting by VLGA or by VLBW, despite the substantial bias toward SGA infants. Either cohorting practice is suitable for international benchmarking.

Electronic supplementary material
The online version of this article (doi:10.1186/s12887-017-0921-x) contains supplementary material, which is available to authorized users.
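
The split-and-validate design described in the Method can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the record field `network` and the function names `split_by_network` and `auc` are hypothetical, and the abstract does not say how the models were fitted or how the AUC was computed. The AUC here uses the rank-sum (Mann-Whitney) identity, one standard way to obtain it.

```python
import random
from collections import defaultdict


def split_by_network(records, frac=2 / 3, seed=0):
    """Randomly assign `frac` of each network's records to development
    and the remainder to validation (hypothetical helper)."""
    rng = random.Random(seed)
    by_network = defaultdict(list)
    for rec in records:
        by_network[rec["network"]].append(rec)  # "network" is an assumed field name

    development, validation = [], []
    for network in sorted(by_network):
        group = by_network[network][:]
        rng.shuffle(group)
        cut = round(len(group) * frac)
        development.extend(group[:cut])
        validation.extend(group[cut:])
    return development, validation


def auc(scores, labels):
    """Area under the ROC curve from predicted scores and 0/1 labels,
    using the rank-sum identity; tied scores receive their average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

In a cross-cohort validation of the kind reported, a model developed on one cohort's development set would be scored on the other cohort's validation set, and the resulting AUCs compared.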