Abstract-In this paper, implementation of a detector with parallel partial candidate-search algorithm is described. Two fully independent partial candidate search processes are simultaneously employed for two groups of transmit antennas based on QR decomposition (QRD) and QL decomposition (QLD) of a multiple-input multiple-output (MIMO) channel matrix. By using separate simultaneous candidate searching processes, the proposed implementation of QRD-QLD searching-based sphere detector provides a smaller latency and a lower computational complexity than the original QRD-M detector for similar error-rate performance in wireless communications systems employing four transmit and four receive antennas with 16-QAM or 64-QAM constellation size. It is shown that in coded MIMO orthogonal frequency division multiplexing (MIMO OFDM) systems, the detection latency and computational complexity of a receiver can be substantially reduced by using the proposed QRD-QLD detector implementation. The QRD-QLD-based sphere detector is also implemented using Field Programmable Gate Array (FPGA) and application specific integrated circuit (ASIC), and its hardware design complexity is compared with that of other sphere detectors reported in the literature.Index Terms-ASIC, bounded soft sphere detector, Field Programmable Gate Array (FPGA), implementation complexity, latency analysis, QL decomposition (QLD), QR decomposition (QRD), QRD-M, QRD-QLD-based parallel candidate search.