This paper focuses on how to efficiently reduce power consumption in coarse-grained reconfigurable designs, to allow their effective adoption in heterogeneous architectures supporting and accelerating complex and highly variable multifunctional applications. We propose a design flow for this kind of architectures that, besides their automatic customization, is also capable of determining their optimal power management support. Power and clock gating implementation costs are estimated in advance, before their physical implementation, on the basis of the functional, technological, and architectural parameters of the baseline design. Experimental results, on 90 and 45 nm CMOS technologies, demonstrate that the proposed approach guides the designer towards optimal implementation.