Background
Infectious diseases are a major threat to public health, causing serious medical consumption and casualties. Accurate prediction of infectious diseases incidence is of great significance for public health organizations to prevent the spread of diseases. However, only using historical incidence data for prediction can not get good results. This study analyzes the influence of meteorological factors on the incidence of hepatitis E, which are used to improve the accuracy of incidence prediction.
Methods
We extracted the monthly meteorological data, incidence and cases number of hepatitis E from January 2005 to December 2017 in Shandong province, China. We employ GRA method to analyze the correlation between the incidence and meteorological factors. With these meteorological factors, we achieve a variety of methods for incidence of hepatitis E by LSTM and attention-based LSTM. We selected data from July 2015 to December 2017 to validate the models, and the rest was taken as training set. Three metrics were applied to compare the performance of models, including root mean square error(RMSE), mean absolute percentage error(MAPE) and mean absolute error(MAE).
Results
Duration of sunshine and rainfall-related factors(total rainfall, maximum daily rainfall) are more relevant to the incidence of hepatitis E than other factors. Without meteorological factors, we obtained 20.74%, 19.50% for incidence in term of MAPE, by LSTM and A-LSTM, respectively. With meteorological factors, we obtained 14.74%, 12.91%, 13.21%, 16.83% for incidence, in term of MAPE, by LSTM-All, MA-LSTM-All, TA-LSTM-All, BiA-LSTM-All, respectively. The prediction accuracy increased by 7.83%. Without meteorological factors, we achieved 20.41%, 19.39% for cases in term of MAPE, by LSTM and A-LSTM, respectively. With meteorological factors, we achieved 14.20%, 12.49%, 12.72%, 15.73% for cases, in term of MAPE, by LSTM-All, MA-LSTM-All, TA-LSTM-All, BiA-LSTM-All, respectively. The prediction accuracy increased by 7.92%. More detailed results are shown in results section of this paper.
Conclusions
The experiments show that attention-based LSTM is superior to other comparative models. Multivariate attention and temporal attention can greatly improve the prediction performance of the models. Among them, when all meteorological factors are used, multivariate attention performance is better. This study can provide reference for the prediction of other infectious diseases.