Many models have been derived from the well-known bio-inspired artificial neural network (ANN). Among them, the multi-layer perceptron (MLP) is widely used as a universal function approximator. With the development of EDA tools and recent research, hardware implementations of MLPs can be generated on FPGAs rapidly and conveniently from pre-designed IP cores. At the same time, we focus on exploiting the inherent parallelism of neural networks. In this paper, we first propose the hardware architecture of the modular IP cores. Then, a parallel MLP is devised as an example. Finally, conclusions are drawn.