To understand and forecast biological responses to climate change, scientists frequently use field experiments that alter temperature and precipitation. Climate manipulations can manifest in complex ways, however, challenging interpretations of biological responses. We reviewed publications to compile a database of daily plot‐scale climate data from 15 active‐warming experiments. We find that the common practice of analysing treatments as mean or categorical changes (e.g. warmed vs. unwarmed) masks important variation in treatment effects over space and time. Our synthesis shows that measured mean warming, in plots with the same target warming within a study, differed by up to 1.6 °C (63% of target), on average, across six studies with blocked designs. Variation was high across sites and designs: for example, plots differed by 1.1 °C (47% of target) on average for infrared studies with feedback control (n = 3) vs. 2.2 °C (80% of target) on average for infrared studies with constant wattage designs (n = 2). Warming treatments also produce non‐temperature effects, such as soil drying. The combination of these direct and indirect effects is complex and can have important biological consequences. With a case study of plant phenology across five experiments in our database, we show that accounting for drier soils with warming tripled the estimated sensitivity of budburst to temperature. We provide recommendations for future analyses, experimental design, and data sharing to improve the mechanistic understanding gained from climate change experiments, and thus their utility for accurately forecasting species’ responses.