Many burst-mode applications require high performance for brief periods between extended intervals of low-performance operation. Digital circuits supporting such applications should therefore work in both the near-threshold regime and, for brief periods, the super-threshold regime. This work proposes structural support for fine-grained ultra-dynamic voltage scaling (UDVS), from the traditional strong-inversion region down to the near-threshold region, under a limit on the number of power rails. The number, type, and size of the power switches are jointly optimized to minimize the overall energy consumption of the UDVS circuit block while satisfying the target delay or frequency requirement at each DVS level. The proposed optimization framework accounts for both the dynamic energy consumption and the leakage energy consumption through all the power switches, during both the operation time and the standby time of the circuit block. Experimental results using the 22 nm Predictive Technology Model demonstrate the effectiveness of the proposed optimization framework.