Near-threshold operation enables high energy efficiency, but requires proper design techniques to deal with performance loss and increased sensitivity to process variations. In this paper, we address both issues with two synergistic approaches. First, we introduce a novel body-biasing technique that mitigates the performance loss at near-threshold voltages without requiring any additional circuitry for body-bias control, thereby minimizing the design effort and simplifying system-on-chip integration. Second, we introduce a novel statistical design methodology that efficiently and accurately evaluates the design guardband strictly needed in the worst case, thereby keeping the area cost of variations to a minimum. A 65-nm advanced encryption standard (AES) testchip demonstrates a 1.65× throughput improvement over a baseline design without body biasing and, unlike traditional body-biasing schemes, operates reliably over a wide voltage range (0.5-1.2 V). In addition, our testchip achieves a 1.63× area efficiency improvement compared with a design based on corner analysis. Accordingly, the proposed techniques are well suited for the design of near-threshold specialized hardware with improved performance and reduced silicon area and design effort.

Index Terms: Advanced encryption standard (AES), energy efficiency, forward body-biasing, near-threshold VLSI circuits, process variations, statistical static timing analysis (SSTA), surrogate timing model.