Electromagnetic transient (EMT) simulation is one of the most complex power system studies, requiring detailed modeling of the study system, including all frequency-dependent and nonlinear effects. Large-scale EMT simulation is becoming commonplace due to the growth and increasing interconnection of power grids and the need to study the impact of system events on the wide-area network. To cope with the enormous computational burden, this paper exploits the massively parallel architecture of the graphics processing unit (GPU) for large-scale EMT simulation. A fine-grained network decomposition, called shattering network decomposition, is proposed. It exploits the topological and physical characteristics of the power system network to divide it into linear and nonlinear subnetworks that suit the massively threaded computing architecture of the GPU. Large-scale systems of up to 240 000 nodes, containing typical components such as synchronous machines, transformers, transmission lines, and nonlinear elements, as well as modular multilevel converters of various levels with up to 6144 submodules, are simulated and compared with mainstream simulation software to verify the accuracy and to demonstrate the speed-up with respect to sequential computation.
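To illustrate why decomposing a network into small subnetworks suits parallel hardware, the following minimal Python sketch solves two decoupled nodal systems independently and checks that the result matches a single solve of the assembled block-diagonal system. It is not the shattering network decomposition itself. The conductance matrices, current injections, and function names (`solve_dense`, `solve_blocks`, `assemble_full`) are hypothetical, and a thread pool merely stands in for GPU threads. The sketch assumes the subnetworks are already decoupled within a time step, for example through transmission-line travel-time delays, a common technique in EMT simulation.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_dense(G, i):
    """Solve G v = i by Gaussian elimination with partial pivoting."""
    n = len(G)
    A = [row[:] + [rhs] for row, rhs in zip(G, i)]  # augmented matrix [G | i]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    # Back substitution.
    v = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][c] * v[c] for c in range(r + 1, n))
        v[r] = (A[r][n] - s) / A[r][r]
    return v


# Hypothetical decoupled subnetworks: (conductance matrix, injected currents).
blocks = [
    ([[3.0, -1.0], [-1.0, 2.0]], [1.0, 0.0]),
    ([[4.0, -2.0], [-2.0, 5.0]], [0.0, 2.0]),
]


def solve_blocks(blocks):
    """Solve each subnetwork independently; threads stand in for GPU threads."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda b: solve_dense(*b), blocks))


def assemble_full(blocks):
    """Build the equivalent block-diagonal system for a single reference solve."""
    n = sum(len(Gb) for Gb, _ in blocks)
    G = [[0.0] * n for _ in range(n)]
    i = []
    offset = 0
    for Gb, ib in blocks:
        for r, row in enumerate(Gb):
            for c, val in enumerate(row):
                G[offset + r][offset + c] = val
        i.extend(ib)
        offset += len(Gb)
    return G, i


if __name__ == "__main__":
    v_blocks = [x for v in solve_blocks(blocks) for x in v]
    G_full, i_full = assemble_full(blocks)
    v_full = solve_dense(G_full, i_full)
    print(v_blocks)
    print(v_full)
```

Because the assembled matrix is block diagonal, the per-block solves involve no shared data, so their cost grows with the block size rather than with the size of the whole network. This is the general property that a fine-grained decomposition aims to expose to a many-thread processor.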