To defend against quantum computer attacks, the National Institute of Standards and Technology (NIST) has been exploring post-quantum cryptography (PQC). Now, NIST has standardised only two PQC algorithms, one of which is the Leighton-Micali signature (LMS). However, the performance of LMS limits its practical application. In this paper, we propose a parallel LMS implementation on multiple nodes. Considering different application scenarios, we provide two parallel schemes: algorithmic parallelism and data parallelism. The main part of our work is the two-tier parallel structure for the LMS tree. Targeting the x86/64 multiple nodes, our work introduces vectorization to present the three-tier parallel structure. We also design communication optimization, including the selection of communication primitives and the creation of communicators for multi-node running. Experimental evidence shows that our code effectively reduces the latency, and is 19.04× faster than the fastest implementation on the same platform when running key pair generation for LMS SHA256 M32 H20(20).