Integrating networks-on-chip (NoCs) on FPGAs can improve device scalability and facilitate design by abstracting communication and simplifying timing closure, not only between modules in the FPGA fabric but also with large "hard" blocks such as high-speed I/O interfaces. We propose mixed and hard NoCs that add less than 1% area to large FPGAs and run 5-6× faster than the soft NoC equivalent. A detailed power analysis, per NoC component, shows that routers consume 14× less power when implemented hard rather than soft and that, whether hard or soft, most of a router's power is consumed in the input modules for buffering. For complete systems, hard NoCs consume less than 6% (and as little as 3%) of the FPGA's dynamic power budget to support 100 GB/s of communication bandwidth. We find that, depending on design choices, hard NoCs consume 4.5-10.4 mJ of energy per GB of data transferred. Surprisingly, this is comparable to the energy efficiency of the simplest traditional interconnect on an FPGA: soft point-to-point links require 4.7 mJ/GB. In many designs, communication must also include multiplexing, arbitration and/or pipelining. For all these cases, our results indicate that a hard NoC will be more energy efficient than the conventional FPGA fabric.
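
As a rough unit check on the figures quoted above, energy per GB can be converted to average power at a given bandwidth, since (mJ/GB) × (GB/s) = mJ/s. The sketch below is illustrative only: the helper name `power_watts` is hypothetical, and it assumes the quoted mJ/GB figures apply uniformly at the quoted 100 GB/s, which may not match the exact conditions under which they were measured.

```python
def power_watts(energy_mj_per_gb: float, bandwidth_gb_per_s: float) -> float:
    """Average power in watts: (mJ/GB) * (GB/s) gives mJ/s; divide by 1000 for W."""
    return energy_mj_per_gb * bandwidth_gb_per_s / 1000.0


# Figures quoted in the abstract; applying them together is an assumption.
BANDWIDTH_GB_PER_S = 100.0
HARD_NOC_MJ_PER_GB = (4.5, 10.4)  # range across hard NoC design choices
SOFT_P2P_MJ_PER_GB = 4.7          # soft point-to-point links

hard_low = power_watts(HARD_NOC_MJ_PER_GB[0], BANDWIDTH_GB_PER_S)
hard_high = power_watts(HARD_NOC_MJ_PER_GB[1], BANDWIDTH_GB_PER_S)
soft_p2p = power_watts(SOFT_P2P_MJ_PER_GB, BANDWIDTH_GB_PER_S)

print(f"hard NoC: {hard_low:.2f}-{hard_high:.2f} W")  # hard NoC: 0.45-1.04 W
print(f"soft point-to-point: {soft_p2p:.2f} W")       # soft point-to-point: 0.47 W
```

Under these assumptions, moving 100 GB/s costs roughly 0.45-1.04 W on a hard NoC and about 0.47 W on soft point-to-point links, which is the sense in which the two are comparable.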