CRYSTALS-Kyber (Kyber) was recently chosen as the first quantum resistant Key Encapsulation Mechanism (KEM) scheme for standardisation, after three rounds of the National Institute of Standards and Technology (NIST) initiated PQC competition which begin in 2016 and search of the best quantum resistant KEMs and digital signatures. Kyber is based on the Module-Learning with Errors (M-LWE) class of Lattice-based Cryptography, that is known to manifest efficiently on FPGAs. This work explores several architectural optimizations and proposes a high-performance and area-time (AT) product efficient hardware accelerator for Kyber. The proposed architectural optimizations include inter-module and intra-module pipelining, that are designed and balanced via FIFO based buffering to ensure maximum parallelisation. The implementation results show that compared to state-of-the-art designs, the proposed architecture delivers 25-51% speedups for Kyber's three different security levels on Artix-7 and Zynq UltraScale+ devices, and a 50-75% reduction in DSPs at comparable security level. Consequently, the proposed design achieve higher AT product efficiencies of 19-33%.