In this paper, we demonstrate an FPGA-accelerated design of computationally efficient symbol-level precoding (SLP) for high-throughput communication systems. The SLP technique recalculates the optimal beamforming vectors by solving a non-negative least squares problem for every set of transmitted symbols. It exploits constructive inter-user interference to minimize the total transmitted power and increase service availability. These benefits come at the cost of a substantially increased computational load at the gateway. The FPGA design enables the SLP technique to operate in real time and to provide a high symbol throughput for multiple receive terminals. We express the SLP technique as a closed-form algorithm, translate it into a hardware description language (HDL), and build an optimized HDL core for an FPGA. We evaluate the FPGA resource occupation required for high-throughput multiple-input multiple-output (MIMO) systems of sizeable dimensions. We describe the algorithmic code, the I/O port mapping, and the functional behavior of the HDL core. We deploy the IP core on an actual FPGA unit and benchmark the energy efficiency of the SLP. The synthetic tests demonstrate a fair improvement in the energy efficiency of the proposed closed-form algorithm compared to the best results obtained through MATLAB numerical simulations.

INDEX TERMS Convex programming, field programmable gate arrays, hardware resources, multicast communication, MIMO, optimization, precoding, power minimization, interference, wireless channels.