Independent component analysis (ICA) has been used in many applications, including self-interference cancellation in in-band full-duplex wireless communication systems. This paper presents a high-throughput and highly efficient configurable preprocessor for the ICA algorithm. The proposed ICA preprocessor has three major components for centering, for computing the covariance matrix, and for eigenvalue decomposition (EVD). The circuit structures and operational flows for these components are designed both in individual and in joint sense. Specifically, the proposed preprocessor is based on a high-performance matrix multiplication array (MMA) that is presented in this paper. The proposed MMA architecture uses time-multiplexed processing so that the efficiency of hardware utilization is greatly enhanced. Furthermore, the novel processing flow of the proposed preprocessor is highly optimized, so that the centering, the calculation of the covariance matrix, and EVD are conducted in parallel and are pipelined. Thus, the processing throughput is maximized while the required number of hardware elements can be minimized. The proposed ICA preprocessor is designed and implemented with a circuit design flow, and performance estimates based on the post-layout evaluations are presented in this paper. This paper shows that the proposed preprocessor achieves a throughput of 40.7 kMatrices per second for a complexity of 73.3 kGE. Compared with prior work, the proposed preprocessor achieves the highest processing throughput and best efficiency.