State-of-the-art FPGAs integrate multi-million gate configurable logic and heterogeneous hardware components. They are an attractive choice for implementing Viterbi decoders. As more emphasis is placed on time and energy performance, previous FPGA implementations of Viterbi decoders either fail to provide high data throughput or are not energy efficient. In this paper, we propose an architecture for implementing Viterbi decoders on FPGAs. Our architecture can provide various throughput and energy trade-offs. Considering the throughput/energy performance metric, experimental results show that our design achieves improvements up to 26.1% compared with the previous designs.