The electric network frequency (ENF), often referred to as the industrial heartbeat, plays a crucial role in the power system. In recent years, it has found applications in multimedia evidence identification for court proceedings and in audio–visual temporal source identification. This paper introduces UniTS-SinSpec, an ENF region classification model built within the UniTS framework. The model integrates a sinusoidal activation function and a spectral attention mechanism and uses a redesigned model framework. The model is trained on a public dataset hosted on the Open Science Framework (OSF) platform. Experimental results show that, after parameter optimization, UniTS-SinSpec achieves an average validation accuracy of 97.47%, surpassing current state-of-the-art and baseline models. Accurate region classification can significantly aid ENF temporal source identification. Future research will focus on expanding dataset coverage and diversity to verify the model’s generality and robustness across different regions, time spans, and data sources, and on exploring the application potential of ENF region classification in preventing crimes such as telecommunications fraud, terrorism, and child pornography.