Job shop scheduling problem (JSSP) is essential in the production, which can significantly improve production efficiency. Dynamic events such as machine breakdown and job rework frequently occur in smart manufacturing, making the dynamic job shop scheduling problem (DJSSP) methods urgently needed. Existing rule-based and meta-heuristic methods cannot cope with dynamic events in DJSSPs of different sizes in real time. This paper proposes an end-to-end transformer-based deep learning method named spatial pyramid pooling-based transformer (SPP-Transformer), which shows strong generalizability and can be applied to different-sized DJSSPs. The feature extraction module extracts the production environment features that are further compressed into fixed-length vectors by the feature compression module. Then, the action selection module selects the simple priority rule in real time. The experimental results show that the makespan of SPP-Transformer is 11.67% smaller than the average makespan of dispatching rules, meta-heuristic methods, and RL methods, proving that SPP-Transformer realizes effective dynamic scheduling without training different models for different DJSSPs. To the best of our knowledge, SPP-Transformer is the first application of an end-to-end transformer in DJSSP, which not only improves the productivity of industrial scheduling but also provides a paradigm for future research on deep learning in DJSSP.