Recent advances in deep learning have driven the exploration of big-data-driven fault diagnosis techniques. However, conventional models often carry prohibitive computational demands, which makes them impractical for on-site deployment in rolling bearing fault diagnosis. To address this challenge, this paper introduces a lightweight fault diagnosis model with a pyramid architecture, named the Shuffle-Fusion Pyramid Network (Shuffle-FPN). The model makes three contributions: (1) A pyramid structure fuses fault signals across multiple scales, increasing the network’s width while reducing its depth. (2) Depthwise separable convolutions reduce the number of network parameters, yielding a lightweight model, and channel shuffling ensures thorough information fusion across convolutional channels. (3) A global representation module offsets the loss of global context that accompanies increased convolutional depth. Together, these design choices enable Shuffle-FPN to extract subtle fault features from noisy signals and to run on devices with limited memory, supporting real-time fault diagnosis in complex environments. Experiments on public datasets from Paderborn University (PU), supplemented by validation on our research group’s experimental data, show that Shuffle-FPN achieves strong fault identification performance under severe noise while reducing memory footprint, which supports its suitability for practical fault diagnosis.
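To make the two lightweight mechanisms named in contribution (2) concrete, the following is a minimal sketch and not the paper's implementation. It assumes one-dimensional convolutions without bias terms, and the function names and the layer sizes (64 input channels, 128 output channels, kernel size 3) are hypothetical values chosen only for illustration. It compares the parameter count of a standard convolution with that of a depthwise separable one, and shows the channel permutation that a ShuffleNet-style channel shuffle applies.

```python
def standard_conv1d_params(c_in, c_out, k):
    """Weights in a standard 1D convolution (bias ignored)."""
    return c_in * c_out * k


def depthwise_separable_conv1d_params(c_in, c_out, k):
    """Weights in a depthwise (one k-tap filter per input channel)
    plus pointwise (1x1) convolution pair, bias ignored."""
    depthwise = c_in * k
    pointwise = c_in * c_out
    return depthwise + pointwise


def channel_shuffle(channels, groups):
    """Reorder channels as in a ShuffleNet-style shuffle: view the list
    as (groups, per_group), transpose, then flatten."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# Hypothetical layer sizes, for illustration only.
C_IN, C_OUT, K = 64, 128, 3
standard = standard_conv1d_params(C_IN, C_OUT, K)
separable = depthwise_separable_conv1d_params(C_IN, C_OUT, K)
ratio = separable / standard  # equals 1/C_OUT + 1/K

# Six channels in two groups: [0, 1, 2] and [3, 4, 5].
shuffled = channel_shuffle(list(range(6)), groups=2)

print(standard, separable, round(ratio, 3))
print(shuffled)
```

Under these assumptions the separable pair needs roughly one third of the weights of the standard layer, and after the shuffle each adjacent pair of channels comes from different groups, so a subsequent grouped convolution sees information from every group.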