Summary
Modern embedded processors contain various functional units (FUs), each performing different operations while an application runs. Traditionally, compilers have not exploited this characteristic to reduce processor temperature during execution. In many applications, the compiled code does not use the different instruction types equally often, so some FUs are far more active than others. The heavily used FUs contribute most of the heat, which raises the processor temperature and can severely damage the system. To address this problem, this paper shifts load from heavily loaded FUs to lightly loaded FUs. Our approach first identifies the FUs that can exchange load with one another and then presents a thermal model of these exchangeable FUs to estimate the temperature impact of shifting load. Finally, the load is shifted by transforming code, subject to a limit on performance loss and without any hardware cost. The results show that our approach can reduce the temperature at a small cost in performance degradation and code expansion. Copyright © 2014 John Wiley & Sons, Ltd.
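The three steps summarized above (pick a pair of exchangeable FUs, estimate the thermal effect, shift load within a performance bound) can be illustrated with a minimal sketch. This is not the paper's thermal model or code transformation: the linear temperature formula, the `AMBIENT_C` constant, the `FU` class, the `heat_per_op` and `extra_cycles_per_op` parameters, and the example units are all assumptions made for illustration only.

```python
from dataclasses import dataclass

AMBIENT_C = 45.0  # assumed baseline temperature in Celsius (hypothetical)


@dataclass
class FU:
    name: str
    ops: int            # dynamic count of operations issued to this unit
    heat_per_op: float  # assumed temperature rise per operation (hypothetical)


def temperature(fu: FU) -> float:
    # Toy linear model: temperature grows with the unit's activity.
    return AMBIENT_C + fu.ops * fu.heat_per_op


def shift_load(hot: FU, cool: FU, extra_cycles_per_op: int, cycle_budget: int) -> int:
    """Move operations from `hot` to `cool` one at a time.

    Stops when the peak temperature of the pair no longer drops or when
    the extra-cycle budget (the performance bound) would be exceeded.
    Returns the number of operations moved.
    """
    moved = 0
    spent = 0
    while hot.ops > 0 and spent + extra_cycles_per_op <= cycle_budget:
        peak_before = max(temperature(hot), temperature(cool))
        hot.ops -= 1
        cool.ops += 1
        if max(temperature(hot), temperature(cool)) >= peak_before:
            # The move did not lower the peak: undo it and stop.
            hot.ops += 1
            cool.ops -= 1
            break
        moved += 1
        spent += extra_cycles_per_op
    return moved


# Hypothetical pair: multiplications by powers of two could run on a shifter.
multiplier = FU("multiplier", ops=100, heat_per_op=0.5)
shifter = FU("shifter", ops=20, heat_per_op=0.5)
moved = shift_load(multiplier, shifter, extra_cycles_per_op=1, cycle_budget=30)
```

In this toy setting the budget, not the thermal balance, ends the loop: the shifting stops once the allowed extra cycles are used up, which mirrors the trade-off between temperature reduction and performance loss described in the summary.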