In ectotherms, the performance of physiological, ecological, and life-history traits universally increases with temperature to a maximum before decreasing again. Identifying the most appropriate thermal performance model for a specific trait type has broad applications, from metabolic modelling at the cellular level to forecasting the effects of climate change on population, ecosystem, and disease transmission dynamics. To date, numerous mathematical models have been designed, but a thorough comparison among them is lacking. In particular, we do not know whether certain models consistently outperform others, or how factors such as sampling resolution and trait or organismal identity influence model performance. These knowledge gaps have led researchers to select models semi-arbitrarily, potentially introducing biases that compromise environmental suitability predictions. To fill these gaps, we collect 2,739 thermal performance datasets from diverse traits and taxa, to which we fit 83 models used in previous studies. We detect remarkable variation in model performance that is not primarily driven by sampling resolution, trait type, or taxonomic information. Our results highlight a lack of well-defined scenarios in which certain models are more appropriate than others. Therefore, to avoid likely model-specific biases, future studies should perform model selection to identify the most appropriate model(s) for their data.