The adoption of transformer networks has surged across a wide range of AI applications. However, their increased computational complexity, stemming primarily from the self-attention mechanism, constrains their capability and speed in much the same way that convolution operations constrain Convolutional Neural Networks (CNNs). The self-attention algorithm, specifically its Matrix-matrix Multiplication (MatMul) operations, demands substantial memory and computation, thereby restricting the overall performance of the transformer. This paper introduces an efficient hardware accelerator for the transformer network, leveraging memristor-based in-memory computing. The design targets the memory bottleneck associated with MatMul operations in the self-attention process, utilizing approximate analog computation and the highly parallel computation enabled by the memristor crossbar architecture. This approach reduces the number of Multiply-Accumulate (MAC) operations in the transformer network by approximately 10 times while maintaining 93.37% accuracy on the MNIST dataset, as validated with the comprehensive circuit simulator NeuroSim 3.0. Simulation results indicate an area utilization of 6895.7 μm², a latency of 15.52 seconds, an energy consumption of 3 mJ, and a leakage power of 59.55 μW. The methodology outlined in this paper represents a substantial stride towards a hardware-friendly transformer architecture for edge devices, poised to achieve real-time performance.
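To make concrete which MatMul operations the accelerator targets, the following is a minimal single-head self-attention sketch in NumPy. It is purely illustrative: the sequence length, embedding dimension, and weight names are assumptions, and the code shows only where the projection, score, and context MatMuls arise, not the paper's crossbar mapping.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Minimal single-head self-attention, highlighting the MatMul
    operations that dominate memory traffic and compute."""
    # Projection MatMuls: (n, d) x (d, d) for Q, K, and V
    Q = x @ Wq
    K = x @ Wk
    V = x @ Wv

    d = Q.shape[-1]
    # Score MatMul: (n, d) x (d, n) -> (n, n)
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)

    # Context MatMul: (n, n) x (n, d) -> (n, d)
    return weights @ V

# Illustrative sizes: sequence length 64, embedding dimension 64
n, d = 64, 64
rng = np.random.default_rng(0)
x = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (64, 64)
```

In an in-memory computing realization, each of these MatMuls would be evaluated as analog MAC operations on a memristor crossbar rather than as digital matrix products, which is the source of the MAC reduction reported above.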