Longest prefix matching (LPM) is a fundamental process in IP routing used not only in traditional hardware routers but also in software middleboxes. However, the performance of LPM in software is still insufficient for processing packets at over 100 Gbps, although previous studies have tackled this issue by exploiting the CPU cache or accelerators such as GPUs. To improve the performance of software LPM further, we propose a novel LPM method called Spider, which exploits a single-instruction multiple-data (SIMD) mechanism in the CPU. Spider achieves performing LPM for up to 16 destination IP address in parallel by a routing table structure carefully designed for processing by the SIMD instructions. We evaluated Spider from the following three perspectives: the improvement of LPM performance derived from the parallelism provided by the SIMD mechanism, performance comparison with other methods, and performance scalability. The evaluation shows that Spider dramatically improves the LPM performance, which reaches 1.8-3.2 times compared with the state-of-the-art methods. Moreover, Spider achieves 5,074 million lookups per second with 16 CPU cores, which is equivalent to the processing capacity of 3.4 Tbps in short packets; the performance opens up the possibility of packet processing at the terabit-class rate by software.INDEX TERMS IP routing, longest prefix matching (LPM), single-instruction multiple-data (SIMD), software middlebox.