Many epidemics that spread through droplets, including COVID-19 and SARS, have swept the world quickly and claimed many lives. A low-cost, real-time early warning system that identifies individuals who have been in contact with infected individuals and determines whether they need to be quarantined is an effective means of mitigating the spread of an epidemic. In this paper, we propose a smartphone-based, zero-effort epidemic warning method for mitigating epidemic propagation. First, we recognize epidemic-related vocal activity using a hierarchical attention mechanism and a temporal convolutional network. The hierarchical attention mechanism scans the input globally to locate locally important information, then enhances useful information and suppresses irrelevant information. Second, we estimate the social distance between users with the smartphone's built-in sensors. Third, we combine Wi-Fi network logs with the estimated social distance to judge whether there is spatiotemporal contact between users and to determine the duration of that contact. Finally, we estimate infection risk based on epidemic-related vocal activity, social distance, and contact time. We conduct extensive experiments in typical scenarios to verify the proposed method. The proposed method relies on neither additional infrastructure nor historical training data, which makes it well suited to integration with epidemic prevention and control systems and to large-scale deployment.
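As an illustration only, the last two steps (contact detection from Wi-Fi logs plus social distance, followed by risk estimation) might be sketched as below. The abstract does not give the actual risk formula, so everything here is an assumption made for the example: the `Sample` record, the function names, the 2 m distance threshold, the 15-minute saturation time, and the multiplicative weighting are all hypothetical and are not the method proposed in the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sample:
    """One fixed-length observation window for a pair of users (hypothetical record)."""
    ap_a: str           # Wi-Fi access point user A is associated with
    ap_b: str           # Wi-Fi access point user B is associated with
    distance_m: float   # estimated social distance in metres
    vocal_event: bool   # True if epidemic-related vocal activity was recognized


def in_contact(s: Sample, max_distance_m: float) -> bool:
    # Assumed contact rule: same access point AND estimated distance within threshold.
    return s.ap_a == s.ap_b and s.distance_m <= max_distance_m


def contact_duration(samples: Sequence[Sample], step_s: float,
                     max_distance_m: float = 2.0) -> float:
    """Total contact time in seconds: one window length per window in contact."""
    return step_s * sum(1 for s in samples if in_contact(s, max_distance_m))


def risk_score(samples: Sequence[Sample], step_s: float,
               max_distance_m: float = 2.0, saturation_s: float = 900.0) -> float:
    """Toy risk score in [0, 1]; the weighting is an assumption, not the paper's."""
    contact = [s for s in samples if in_contact(s, max_distance_m)]
    if not contact:
        return 0.0
    # Longer contact raises risk until it saturates at saturation_s seconds.
    time_term = min(step_s * len(contact) / saturation_s, 1.0)
    # Closer contact raises risk; the term ranges from 0.5 (at threshold) to 1.0 (at 0 m).
    mean_distance = sum(s.distance_m for s in contact) / len(contact)
    proximity_term = 1.0 - 0.5 * mean_distance / max_distance_m
    # Any recognized vocal activity during contact doubles the score in this sketch.
    vocal_term = 1.0 if any(s.vocal_event for s in contact) else 0.5
    return time_term * proximity_term * vocal_term


if __name__ == "__main__":
    # Ten one-minute windows on the same access point at 1 m, one with a vocal event,
    # followed by five windows on different access points.
    windows = [Sample("ap-1", "ap-1", 1.0, i == 3) for i in range(10)]
    windows += [Sample("ap-1", "ap-2", 1.0, False) for _ in range(5)]
    print(contact_duration(windows, step_s=60.0))
    print(risk_score(windows, step_s=60.0))
```

The multiplicative form is chosen only so that each factor can drive the score toward zero independently; a real system would calibrate such a function against epidemiological data.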