Tail-biting convolutional codes (TBCCs) are used in many modern communication standards, such as LTE and IEEE 802.16e. Because TBCCs do not require a zero-tail, they achieve better coding efficiency than their conventional zero-tailed counterparts. The absence of a zero-tail, however, drastically increases the complexity of a standard maximum-likelihood decoder, making its implementation impractical. Recently, a decoder based on the Viterbi and A* algorithms has been proposed that achieves maximum-likelihood performance with significantly reduced complexity. This paper presents an efficient hardware implementation of this algorithm for the TBCCs specified in both the LTE and IEEE 802.16e standards. The designs have been tested on a Xilinx Spartan 3E starter kit, achieving throughputs of 141 Mbps and 130 Mbps for the LTE and IEEE 802.16e TBCCs, respectively.
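
To make the coding-efficiency claim concrete, the following minimal Python sketch contrasts tail-biting and zero-tail encoding. It is an illustration only and not the decoder or hardware design presented in this paper. The function name `conv_encode`, its parameters, and the toy rate-1/2 code with constraint length 3 and generators (7, 5) in octal are assumptions chosen for brevity; they are not the LTE or IEEE 802.16e codes.

```python
def conv_encode(bits, generators, K, tail_biting=True):
    """Toy feedforward convolutional encoder (hypothetical helper).

    bits        : list of 0/1 information bits, with len(bits) >= K - 1
    generators  : generator polynomials as integers; the MSB taps the current bit
    K           : constraint length (K >= 2), so the encoder memory is m = K - 1
    Returns (coded_bits, start_state, end_state).
    """
    m = K - 1
    state = 0
    if tail_biting:
        # Preload the register with the last m information bits so that
        # the encoder starts and ends in the same state.
        for b in bits[-m:]:
            state = ((b << m) | state) >> 1
        stream = list(bits)
    else:
        # Zero-tail termination: append m zeros to drive the state back to 0.
        stream = list(bits) + [0] * m

    start_state = state
    coded = []
    for b in stream:
        reg = (b << m) | state              # K-bit register, newest bit at the MSB
        for g in generators:
            coded.append(bin(reg & g).count("1") & 1)  # parity of the tapped bits
        state = reg >> 1                    # drop the oldest bit
    return coded, start_state, state


info = [1, 0, 1, 1, 0, 0, 1, 0]
tb_out, tb_start, tb_end = conv_encode(info, (0o7, 0o5), K=3, tail_biting=True)
zt_out, zt_start, zt_end = conv_encode(info, (0o7, 0o5), K=3, tail_biting=False)

print(len(info) / len(tb_out))  # effective rate with tail-biting
print(len(info) / len(zt_out))  # effective rate with a zero-tail
```

For these 8 information bits, the tail-biting encoder emits 16 coded bits while the zero-tail encoder emits 20, so the zero-tail costs part of the nominal rate. The price is that the tail-biting encoder's start state depends on the data and is unknown to the decoder, which is the source of the added decoding complexity.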