In terms of accuracy, one of the best-known techniques for human detection is the histogram of oriented gradients (HOG) method. Unfortunately, computing HOG features is complex and computationally intensive. In this research, we therefore aim to achieve a resource-efficient, low-power HOG hardware architecture that still maintains a high frame rate for real-time processing. This paper introduces a hardware architecture for human detection in 2D images using a simplified HOG algorithm. To increase the frame rate, we simplify the HOG computation while maintaining the detection quality. In the hardware architecture, we design a cell-based processing method instead of a window-based method. Moreover, 64 parallel and pipelined architectures are used to increase the processing speed. Our pipeline architecture significantly reduces memory bandwidth and avoids any use of external memory. An Altera E2-115 field-programmable gate array (FPGA) was employed to evaluate the design. The evaluation results show that our design achieves up to 86.51 frames per second (fps) at a relatively low operating frequency (27 MHz). It consumes 48,360 logic elements (LEs) and 4,363 registers. The performance test results reveal that the proposed solution exhibits a trade-off among fps, clock frequency, register usage, and fps-to-clock ratio.
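
To make the fps-to-clock ratio concrete, the following minimal Python sketch works through the arithmetic for the reported figures (86.51 frames per second at 27 MHz). It assumes the ratio is defined as frames per second divided by the clock frequency in MHz; the abstract does not state the exact definition. The function names `fps_per_mhz` and `cycles_per_frame` are hypothetical and used only for illustration.

```python
# Illustrative arithmetic only; the definition of the fps-to-clock ratio
# (frames per second per MHz of clock) is an assumption, not taken from the paper.

REPORTED_FPS = 86.51        # frames per second, as reported
REPORTED_CLOCK_MHZ = 27.0   # operating frequency in MHz, as reported


def fps_per_mhz(fps: float, clock_mhz: float) -> float:
    """Frames per second delivered per MHz of clock (assumed definition)."""
    if clock_mhz <= 0:
        raise ValueError("clock frequency must be positive")
    return fps / clock_mhz


def cycles_per_frame(fps: float, clock_mhz: float) -> float:
    """Average number of clock cycles available to process one frame."""
    if fps <= 0:
        raise ValueError("frame rate must be positive")
    return (clock_mhz * 1e6) / fps


if __name__ == "__main__":
    ratio = fps_per_mhz(REPORTED_FPS, REPORTED_CLOCK_MHZ)
    cycles = cycles_per_frame(REPORTED_FPS, REPORTED_CLOCK_MHZ)
    print(f"fps per MHz: {ratio:.3f}")
    print(f"clock cycles per frame: {cycles:.0f}")
```

Under this assumed definition, the reported figures correspond to roughly 3.2 frames per second per MHz, or about 312,000 clock cycles per frame. A higher ratio means that more frames are processed for the same clock frequency.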