Dairying in Australia is practiced in highly diverse climatic conditions and production systems, which means that re-ranking of genotypes could occur across environments that vary in temperature and humidity; that is, genotype-by-environment interactions (G × E) may exist. The objective of this study was to investigate G × E for heat tolerance with respect to milk production traits in Australian Holsteins. A total of 6.7 million test-day milk yield records for first, second, and third lactations from 491,562 cows and 6,410 sires that had progeny in different climatic environments were included in the analysis. The environmental gradient used was the temperature-humidity index (THI) calculated from climate data from 163 Australian public weather stations between 2003 and 2017. Data were analyzed using a univariate reaction norm (RM) sire model, and the results were compared with those from a multitrait (MT) model. The MT analysis treated test-day yields at the 5th percentile (THI = 61; i.e., thermoneutral conditions), 50th percentile (THI = 67; i.e., moderate heat stress conditions), and 95th percentile (THI = 73; i.e., high heat stress conditions) of the trajectory of THI as correlated traits. A THI series of 61, 67, and 73, for example, is equivalent to average temperature and relative humidity of approximately 20°C and 45%, 25°C and 45%, and 31°C and 50%, respectively. The RM analysis showed some heterogeneity of additive genetic (AG) and permanent environmental (PE) variance over the trajectory of THI, with estimates decreasing at higher THI values more steeply for PE than for AG variance. The genetic correlations between tests at the 5th and 95th percentiles of THI for milk, protein, and fat yield from RM were 0.88 ± 0.01 (standard error), 0.79 ± 0.01, and 0.86 ± 0.01, respectively, whereas the corresponding estimates from MT were 0.86 ± 0.02, 0.84 ± 0.03, and 0.87 ± 0.03.
We observed lower genetic correlations between the 5th and 95th percentiles of THI for milk tests from recent years (i.e., 2009 to 2017) compared with earlier years (i.e., 2003 to 2008), which suggests that the level of G × E is increasing in the studied population and should be monitored, especially in anticipation of the expected future increase in daily average temperature and in the frequency of heat events. Overall, our results indicate the presence of G × E at the upper extreme of the trajectory of THI, but the current extent of sire re-ranking may not justify providing separate genetic evaluations for different levels of heat stress. However, the variation observed in sire sensitivity to heat stress suggests that dairy herds in high heat load conditions could benefit more from using heat-tolerant or resilient sires.
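
As an illustration of how a genetic correlation between two points on the THI trajectory can be derived from a reaction norm model, the following minimal sketch assumes a linear reaction norm with a genetic intercept and slope. The covariance matrix `K`, the centring of THI at 61, and the function name `rn_genetic_correlation` are hypothetical choices made for illustration only; they are not the model order or the estimates from this study.

```python
import numpy as np

def rn_genetic_correlation(K, x1, x2):
    """Genetic correlation between two gradient points under a linear reaction norm.

    K is the 2 x 2 genetic covariance matrix of (intercept, slope);
    x1 and x2 are positions on the (centred) environmental gradient.
    """
    phi1 = np.array([1.0, x1])   # covariates for environment 1
    phi2 = np.array([1.0, x2])   # covariates for environment 2
    cov = phi1 @ K @ phi2        # genetic covariance between the two environments
    var1 = phi1 @ K @ phi1       # genetic variance in environment 1
    var2 = phi2 @ K @ phi2       # genetic variance in environment 2
    return cov / np.sqrt(var1 * var2)

# Hypothetical (intercept, slope) covariance matrix, NOT estimates from the study.
K = np.array([[1.00, -0.03],
              [-0.03, 0.004]])

# THI centred at the 5th percentile (61), so the 95th percentile (73) sits at x = 12.
thi_low, thi_high = 61, 73
r = rn_genetic_correlation(K, 0.0, float(thi_high - thi_low))
print(f"Genetic correlation between THI {thi_low} and THI {thi_high}: {r:.2f}")
```

Under these assumptions, a correlation below 1 arises only when the slope has non-zero genetic variance, that is, when sires differ in their sensitivity to THI; with these hypothetical values the genetic variance also declines toward the upper end of the gradient.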