Contact tracing is a method used to control the spread of a pandemic. The objectives of this research are to conduct an empirical review and content analysis to identify the environmental factors that drive the spread of a pandemic, and to propose an ontology-based big data architecture that collects these factors for prediction. No prior research has studied these factors as a whole in pandemic prediction. The research method combined an empirical review with content analysis. The keywords contact tracking, pandemic spread, fear, hygiene measures, government policy, prevention programs, pandemic programs, information disclosure, pandemic economics, and COVID-19 were used to retrieve studies on pandemic spread published from 2019 to 2022 from the EBSCOHost databases (e.g., Medline, ERIC, and Library Information Science & Technology). Only 84 of the 588 retrieved studies were relevant. The major environmental factors influencing public pandemic-prevention behavior were risk perception of the pandemic (n = 14), hygiene behavior (n = 7), culture (n = 12), attitudes toward government policies on pandemic prevention (n = 25), education programs (n = 2), business restrictions (n = 2), and technology infrastructure and multimedia usage (n = 24). An ontology-based big data architecture is proposed to collect these factors for building a spread prediction model. The new method overcomes a limitation of traditional pandemic prediction models such as the Susceptible-Exposed-Infected-Recovered (SEIR) model, which use only time series to predict epidemic trends. The big data architecture allows multi-dimensional data and modern AI methods to be used to train contagion scenarios for spread prediction, which helps policymakers plan pandemic prevention programs.
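To make the contrast with traditional models concrete, the following is a minimal sketch of a textbook SEIR simulation, not the model or architecture proposed in this research. The function name `simulate_seir`, the forward Euler integration, and all parameter values (`beta`, `sigma`, `gamma`, the population size) are illustrative assumptions. Its only inputs are initial compartment sizes and three fixed rates, so none of the environmental factors identified above (risk perception, hygiene behavior, culture, policy attitudes, and so on) enter the prediction.

```python
def simulate_seir(population, initial_exposed, initial_infected,
                  beta, sigma, gamma, days, dt=0.1):
    """Integrate the standard SEIR equations with forward Euler steps.

    beta  : transmission rate per day (assumed value, not from the study)
    sigma : rate of leaving the exposed state, 1 / incubation period
    gamma : recovery rate, 1 / infectious period
    Returns one (S, E, I, R) tuple per day, starting with day 0.
    """
    s = float(population - initial_exposed - initial_infected)
    e = float(initial_exposed)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            # Flows between compartments during one small time step.
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Hypothetical scenario: all numbers below are placeholders for illustration.
history = simulate_seir(population=1_000_000, initial_exposed=10,
                        initial_infected=1, beta=0.5, sigma=1 / 5,
                        gamma=1 / 7, days=120)
peak_day = max(range(len(history)), key=lambda d: history[d][2])
print("peak infectious day:", peak_day)
```

Because the rates are constants, any change in public behavior or policy can only be represented by manually re-fitting `beta`; this is the gap that a multi-dimensional, data-driven approach is intended to address.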