We propose a hardware-friendly convolutional neural network architecture based on a 32 × 32 memristor crossbar array with an overshoot suppression layer. Gradual switching in both set and reset operations enables 3-bit multilevel operation across the whole array, which can be used as 16 kernels. Moreover, a binary activation function with a boundary of 0.5, whose outputs are mapped to the read voltage and ground, is introduced, and the result of training with its estimated gradient is evaluated. Additionally, we adopt a fixed kernel method in which inputs are applied sequentially to a crossbar array with a differential memristor pair scheme, reducing the number of unused cells. The binary activation is robust against device state variations, and a neuron circuit is experimentally demonstrated on a customized breadboard. Owing to the analogue switching characteristics of the memristor device, accurate vector-matrix multiplication (VMM) operations are experimentally demonstrated by combining sequential inputs with the weights obtained through tuning operations in the crossbar array. In addition, the feature images extracted by VMM during hardware inference on 100 test samples are classified, and the classification performance obtained with off-chip training is compared with the software results. Finally, the dependence of the inference results on the tolerance is statistically verified over several tuning cycles.
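
The data flow summarized above (3-bit conductance levels, differential memristor pairs, a fixed kernel with sequentially applied inputs, and a binary activation with a boundary of 0.5) can be illustrated with a minimal software sketch. This is not the authors' implementation. The conductance window (`G_MIN`, `G_MAX`), the read voltage `V_READ`, the linear weight-to-conductance mapping, the current normalization, and the rectangular form of `estimated_gradient` are all assumptions chosen for illustration, and every function name is hypothetical.

```python
# Illustrative sketch only. Constants and mappings are assumptions,
# not values reported in the paper.

LEVELS = 8                 # 3-bit multilevel operation gives 8 states
G_MIN, G_MAX = 1.0, 8.0    # hypothetical conductance window (microsiemens)
V_READ = 0.2               # hypothetical read voltage (volts)


def quantize_conductance(g):
    """Snap a target conductance to the nearest of the 8 evenly spaced levels."""
    step = (G_MAX - G_MIN) / (LEVELS - 1)
    g = min(max(g, G_MIN), G_MAX)
    idx = int((g - G_MIN) / step + 0.5)   # round half up for non-negative values
    return G_MIN + idx * step


def weight_to_pair(w):
    """Map a weight in [-1, 1] to a differential pair (G_plus, G_minus)."""
    w = min(max(w, -1.0), 1.0)
    span = G_MAX - G_MIN
    g_plus = G_MIN + span * max(w, 0.0)    # positive part goes to G_plus
    g_minus = G_MIN + span * max(-w, 0.0)  # negative part goes to G_minus
    return quantize_conductance(g_plus), quantize_conductance(g_minus)


def differential_current(inputs, pairs):
    """VMM for one kernel: binary inputs drive V_READ (1) or ground (0)."""
    return sum(V_READ * x * (gp - gm) for x, (gp, gm) in zip(inputs, pairs))


def normalize(current):
    """Scale the current back to weight units (assumed normalization)."""
    return current / (V_READ * (G_MAX - G_MIN))


def binary_activation(x, boundary=0.5):
    """Output 1 (read voltage) at or above the boundary, else 0 (ground)."""
    return 1 if x >= boundary else 0


def estimated_gradient(x, boundary=0.5, width=0.5):
    """Assumed rectangular surrogate gradient used in place of the step's derivative."""
    return 1.0 if abs(x - boundary) <= width else 0.0


def extract_patches(image, k):
    """Yield flattened k x k patches in raster order (the sequential inputs)."""
    rows, cols = len(image), len(image[0])
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            yield [image[r + i][c + j] for i in range(k) for j in range(k)]


def infer_feature_map(image, kernel, k):
    """Program the kernel once, then apply each patch to the same cells."""
    pairs = [weight_to_pair(w) for w in kernel]   # fixed kernel
    feature_map = []
    for patch in extract_patches(image, k):
        current = differential_current(patch, pairs)
        feature_map.append(binary_activation(normalize(current)))
    return feature_map


if __name__ == "__main__":
    image = [[1, 1, 0],
             [1, 1, 0],
             [0, 0, 0]]
    kernel = [0.25, 0.25, 0.25, 0.25]   # hypothetical 2 x 2 averaging kernel
    print(infer_feature_map(image, kernel, 2))
```

In this sketch the kernel occupies the same cells for every patch, which mirrors the idea of keeping the kernel fixed while the inputs are applied one after another. Because the activation output is binary, small shifts in the quantized conductances change the result only when the normalized sum lies close to the 0.5 boundary.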