Low-power miniature systems for ubiquitous computing, such as wireless sensor networks, have developed rapidly in recent years. The growing demand for collecting and analyzing information from the surrounding environment drives researchers and engineers to develop the Internet of Things (IoT). This trend requires future integrated circuits for IoT devices to be ultra-low-power (ULP), flexible, and low-cost. Existing circuit solutions for IoT devices are either too costly, such as sub-threshold ASICs, or too power-hungry, such as sub-threshold microprocessors. ULP FPGAs operating in the near/sub-threshold region, which are flexible and consume much less power than sub-threshold microprocessors, are a promising hardware solution for IoT applications. In this dissertation, the circuit/architecture and tool flow of a custom ULP FPGA are explored and developed.

1) Energy-Efficient FPGA Interconnect

The global interconnect is the major power consumer in the core fabric of FPGAs. Studies have shown that over 65% of power is dissipated in the interconnection fabric. The same conclusion applies to delay and area. The strict speed and energy requirements of IoT applications make reducing the energy and improving the energy efficiency of FPGA routing fabrics a driving challenge. In this dissertation, an energy-efficient low-swing interconnect is modeled, optimized, and evaluated in the near/sub-threshold region. When implementing the Microelectronics Center of North Carolina (MCNC) benchmarks, the proposed interconnect achieves a 68.4% delay reduction and a 47.5% energy reduction compared to prior work.

2) Per-Path Voltage Scaling and Power-Gating

Per-path voltage scaling is a technique that reduces FPGA energy to the minimum while maintaining the overall FPGA speed by reducing the supply