A major effect of environment on crops is through crop phenology, so the capacity to predict phenology as a function of soil, weather, and management is important. Mechanistic crop models are a major tool for such predictions. It has been shown that predictions by different modeling groups vary widely for the same inputs, which indicates a need for shared improvement of crop models. Two pathways to improvement are better understanding of the mechanisms of the modeled system and better model parameterization. This article focuses on improving crop model parameters through improved calibration, specifically for prediction of crop phenology. A detailed calibration protocol is proposed that covers all the steps in the calibration workflow: choice of default parameter values, choice of objective function, choice of parameters to estimate from the data, calculation of optimal parameter values, and diagnostics. For those aspects where knowledge of the model and target environments is required, the protocol gives detailed guidelines rather than strict instructions. The protocol includes documentation tables to make the calibration process more transparent. The protocol was applied by 19 modeling groups to three data sets for wheat phenology. All groups first calibrated their model using their "usual" calibration approach and then recalibrated it following the new protocol. Evaluation was based on data from sites and years not represented in the training data. Compared to usual calibration, calibration following the new protocol significantly reduced the error in predictions for the evaluation data and reduced the variability between modeling groups by 22%.