3D shape information is an important cue in image processing and computer vision. Unlike traditional multi-input depth from defocus (DFD) techniques, the monocular DFD (MDFD) algorithm proposed by Hu and Haan can reconstruct 3D shape from a single monocular defocused image with low computational complexity. In this paper, we present a real-time MDFD system implemented on an FPGA device. To reduce the FPGA design cost, Vivado High-Level Synthesis (VHLS) is used to design the MDFD system. The system architecture, which is built on FIFO-based convolution, is first described in C/C++ code, and VHLS then converts this code into the FPGA design. The PIPELINE, LOOP_MERGE, and ARRAY_PARTITION directives are then applied to optimize the latency and initiation interval of the proposed system. The performance and resource utilization of the whole system are evaluated by processing defocused images of real scenes at a resolution of 640×480 pixels. The system processes about 22 images per second at a 20 MHz working frequency and maintains 93.29% depth accuracy on the 3D object test, which makes it a real-time, state-of-the-art MDFD system in comparison with other recent works.

INDEX TERMS 3D reconstruction, FPGA, monocular depth from defocus, Vivado high-level synthesis.
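
To make the idea of FIFO-based convolution concrete, the following is a minimal C++ sketch of a streaming 3×3 convolution that keeps the two previous image rows in a line buffer and slides a small window across them. It is an illustration under stated assumptions, not the design presented in this paper: the function name `conv3x3_stream`, the 3×3 kernel size, the zero-valued borders, and the placement of the PIPELINE and ARRAY_PARTITION pragmas are all hypothetical, and LOOP_MERGE is not shown. An ordinary C++ compiler ignores the `#pragma HLS` lines, so the sketch can also be run in software.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int K = 3;  // kernel size (assumed for illustration)

// Streams a row-major 8-bit image through a (K-1)-row line buffer and a KxK
// window. Each incoming pixel is read exactly once. Outputs are written at the
// window center; border pixels without a full window stay zero.
// Note: this computes correlation (the kernel is not flipped).
std::vector<int32_t> conv3x3_stream(const std::vector<uint8_t>& in,
                                    int width, int height,
                                    const int32_t kernel[K][K]) {
    assert(in.size() == static_cast<std::size_t>(width) * height);
    std::vector<int32_t> out(in.size(), 0);

    // Line buffer holding the previous K-1 rows; in an HLS design this would
    // typically be a fixed-size array mapped to on-chip memory or FIFOs.
    std::vector<uint8_t> line_buf(static_cast<std::size_t>(K - 1) * width, 0);
    uint8_t window[K][K] = {};
#pragma HLS ARRAY_PARTITION variable=window complete dim=0

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
#pragma HLS PIPELINE II=1
            const uint8_t px = in[static_cast<std::size_t>(y) * width + x];

            // Shift the window one column to the left.
            for (int i = 0; i < K; ++i)
                for (int j = 0; j < K - 1; ++j)
                    window[i][j] = window[i][j + 1];

            // Fill the new right-hand column: older rows come from the line
            // buffer, the newest row is the incoming pixel.
            for (int i = 0; i < K - 1; ++i)
                window[i][K - 1] = line_buf[static_cast<std::size_t>(i) * width + x];
            window[K - 1][K - 1] = px;

            // Advance the line buffer at column x: drop the oldest row.
            for (int i = 0; i < K - 2; ++i)
                line_buf[static_cast<std::size_t>(i) * width + x] =
                    line_buf[static_cast<std::size_t>(i + 1) * width + x];
            line_buf[static_cast<std::size_t>(K - 2) * width + x] = px;

            // A full window is available once K-1 rows and columns have passed.
            if (y >= K - 1 && x >= K - 1) {
                int32_t acc = 0;
                for (int i = 0; i < K; ++i)
                    for (int j = 0; j < K; ++j)
                        acc += kernel[i][j] * static_cast<int32_t>(window[i][j]);
                out[static_cast<std::size_t>(y - 1) * width + (x - 1)] = acc;
            }
        }
    }
    return out;
}
```

Because every input pixel is consumed once and in raster order, a structure of this kind needs only K-1 rows of storage rather than a full frame buffer, which is the property that makes it attractive for pipelined hardware.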