This paper introduces a novel general-purpose algorithm for Pauli decomposition that relies on matrix slicing and addition instead of expensive matrix multiplication, which significantly accelerates the decomposition of multi-qubit matrices. A detailed complexity analysis shows that the algorithm achieves the best known worst-case scaling and more favorable runtimes for many practical examples. Numerical experiments confirm that the asymptotic speed-up is already visible at small instance sizes, underscoring the algorithm's potential relevance to quantum computing and quantum chemistry simulations.
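
To make the idea of decomposing by slicing and addition concrete, the following Python sketch recursively splits a 2^n x 2^n matrix into four quadrants and combines them by addition only, using the block structure of I, X, Y and Z in the leading tensor factor. It is an illustrative sketch under stated assumptions, not the paper's implementation: the function name `pauli_decompose`, the dictionary output, and the convention that the first character of each label refers to the leftmost tensor factor are all choices made here.

```python
import numpy as np


def pauli_decompose(matrix):
    """Return {pauli_label: coefficient} such that
    matrix = sum(coefficient * P_1 (x) P_2 (x) ... ) over all labels.

    Hypothetical helper for illustration; uses only slicing and addition.
    """
    a = np.asarray(matrix, dtype=complex)
    if (a.ndim != 2 or a.shape[0] != a.shape[1]
            or a.shape[0] < 1 or a.shape[0] & (a.shape[0] - 1)):
        raise ValueError("expected a square matrix of dimension 2^n")

    dim = a.shape[0]
    if dim == 1:
        # Zero qubits left: the single entry is the coefficient.
        return {"": a[0, 0]}

    h = dim // 2
    a00, a01 = a[:h, :h], a[:h, h:]
    a10, a11 = a[h:, :h], a[h:, h:]

    # With A = I(x)B_I + X(x)B_X + Y(x)B_Y + Z(x)B_Z, the quadrants satisfy
    # A00 = B_I + B_Z, A11 = B_I - B_Z, A01 = B_X - i B_Y, A10 = B_X + i B_Y.
    blocks = {
        "I": 0.5 * (a00 + a11),
        "X": 0.5 * (a01 + a10),
        "Y": 0.5j * (a01 - a10),
        "Z": 0.5 * (a00 - a11),
    }

    coeffs = {}
    for label, block in blocks.items():
        if not block.any():
            continue  # an all-zero block contributes no Pauli strings
        for rest, value in pauli_decompose(block).items():
            coeffs[label + rest] = value
    return coeffs
```

For example, calling `pauli_decompose(0.5 * np.kron(X, Z) + 2.0 * np.kron(I2, Y))` with the usual 2x2 Pauli matrices `X`, `Y`, `Z` and identity `I2` should return nonzero coefficients only for the labels `"XZ"` and `"IY"`. Skipping all-zero blocks prunes entire subtrees of Pauli strings, which is one reason sparse or structured inputs can be cheaper than the worst case in a scheme of this kind.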