Beamspace processing is an emerging paradigm to reduce hardware complexity in all-digital millimeter-wave (mmWave) massive multiple-input multiple-output (MIMO) basestations. This approach exploits the sparsity of mmWave channels but requires spatial discrete Fourier transforms (DFTs) across the antenna array, which must be performed at the baseband sampling rate. To mitigate the resulting DFT hardware implementation bottleneck, we propose a fully-unrolled Streaming MUltiplierLess (SMUL) fast Fourier transform (FFT) engine that performs one transform per clock cycle. The proposed SMUL-FFT architecture avoids hardware multipliers by restricting the twiddle factors to a sum-of-powers-of-two, resulting in substantial power and area savings. Compared to state-of-the-art FFTs, our SMUL-FFT ASIC designs in 65 nm CMOS demonstrate more than 45% and 17% improvements in energy-efficiency and area-efficiency, respectively, without noticeably increasing the error-rate in mmWave massive MIMO systems.
I. INTRODUCTION
Millimeter-wave (mmWave) communication [1], [2] promises significantly increased data-rates due to the availability of large contiguous frequency bands. Massive multiuser multiple-input multiple-output (MU-MIMO) [3] is a key technology to combat the high path loss of mmWave propagation [2] while enabling simultaneous communication with multiple user equipments (UEs) in the same frequency band. The higher baseband sampling rates needed to support larger bandwidths at mmWave frequencies, combined with the large number of antennas in massive MU-MIMO, result in new challenges for analog and digital hardware design.
A. Fast Fourier Transforms for Beamspace Processing
MmWave channels typically comprise only a few dominant propagation paths [1], [2], making them sparse in the beamspace domain [4]-[9]. Beamspace processing exploits this sparsity to reduce the computational complexity of baseband processing [10]-[12]. This approach, which is described in detail in Section II-A, calls for spatial discrete Fourier transforms (DFTs) operated at the baseband sampling rate in order to convert the received signals at the antenna array into the beamspace domain. In high-bandwidth mmWave communication systems, billions of spatial DFTs must be computed per second.
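To illustrate why a spatial DFT exposes channel sparsity, the following minimal sketch builds a two-path channel for a uniform linear array and transforms it into the beamspace domain. Everything in it is an illustrative assumption rather than the setup used in this work: the array size B = 64, the path gains and spatial frequencies, and the helper names `ula_channel` and `spatial_dft`. The transform is a direct O(B^2) reference DFT, not the proposed SMUL-FFT.

```python
import cmath
import math


def spatial_dft(y):
    """Unitary DFT across the antenna index (direct O(B^2) reference, not an FFT)."""
    B = len(y)
    scale = 1.0 / math.sqrt(B)
    return [
        scale * sum(y[b] * cmath.exp(-2j * math.pi * k * b / B) for b in range(B))
        for k in range(B)
    ]


def ula_channel(B, paths):
    """Hypothetical channel vector of a B-antenna uniform linear array.

    paths is a list of (complex gain, spatial frequency in cycles per antenna).
    """
    return [
        sum(g * cmath.exp(2j * math.pi * f * b) for g, f in paths)
        for b in range(B)
    ]


B = 64  # assumed number of basestation antennas
# Assumed paths: one aligned with DFT bin 5, one falling between bins 20 and 21.
paths = [(1.0, 5 / B), (0.5j, 20.3 / B)]

h = ula_channel(B, paths)   # antenna-domain channel: energy spread over all antennas
h_beam = spatial_dft(h)     # beamspace channel: energy concentrated in a few beams

# Fraction of the total energy captured by the four strongest beams.
energy = sorted((abs(x) ** 2 for x in h_beam), reverse=True)
total = sum(energy)
top4_fraction = sum(energy[:4]) / total
strongest_beam = max(range(B), key=lambda k: abs(h_beam[k]))

print(f"strongest beam index: {strongest_beam}")
print(f"energy in 4 strongest beams: {top4_fraction:.3f}")
```

In this toy example, nearly all of the channel energy lands in a handful of beams, whereas every antenna-domain entry has comparable magnitude. The off-grid path leaks into neighboring beams, so practical beamspace channels are only approximately sparse.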