Abstract. Hemorrhagic fever with renal syndrome (HFRS) is an important public health problem in Shandong Province, China. In this study, we combined ecologic niche modeling with geographic information systems (GIS) and remote sensing techniques to identify the risk factors and affected areas of hantavirus infections in rodent hosts. Land cover and elevation were found to be closely associated with the presence of hantavirus-infected rodent hosts. The average area under the receiver operating characteristic curve was 0.864, indicating good model performance. The predicted risk maps based on the model were validated against both the distribution of hantavirus-infected rodents and the localities of human HFRS cases, with a good fit. These findings have applications for targeting control and prevention efforts.

Ecological niche models. Ecological niche models (ENMs) were developed to understand the environmental variation associated with the distribution of infected rodent reservoirs. 21 A recently introduced presence-only distribution modeling technique, the Maximum Entropy approach, has been applied in various domains with high predictive accuracy, 22-28 and has shown the best predictive power across all sample sizes. 3,25,29-31 Detailed descriptions of the Maximum Entropy program (MAXENT, version 3.3.1) can be found in References 25 and 32. During 2005-2008, 33 sample sites in the study areas were positive for hantavirus infection in rodents, and 10,000 background points (2,500 for each year) were sampled using a spatially random method.

The importance of EGVs contributing to the distribution of hantavirus infection was determined by three analyses. First, in the jackknife analysis of the average gain with training and test data, models were created in turn with each individual variable, with all the remaining variables, and with all variables, and the corresponding results were compared.
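For illustration only, the background sampling and the jackknife comparison described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the MAXENT procedure itself: the bounding box `BBOX`, the variable name `other_egv`, the gain values in `TOY_GAIN`, and the helper functions are all hypothetical, and the toy scoring function merely stands in for fitting a model and reading off its gain. A real analysis would restrict background points to the study-area boundary rather than a rectangle.

```python
import random


def sample_background(years, n_per_year, bbox, seed=0):
    """Draw n_per_year uniformly random (lon, lat) points per year inside bbox."""
    rng = random.Random(seed)
    min_lon, min_lat, max_lon, max_lat = bbox
    return [
        (year, rng.uniform(min_lon, max_lon), rng.uniform(min_lat, max_lat))
        for year in years
        for _ in range(n_per_year)
    ]


def jackknife(variables, score):
    """Score models with each variable alone, with it left out, and with all."""
    full = score(tuple(variables))
    table = {}
    for v in variables:
        only = score((v,))
        without = score(tuple(u for u in variables if u != v))
        # "loss" is the drop in score when v is removed from the full model.
        table[v] = {"only": only, "without": without, "loss": full - without}
    return full, table


# Hypothetical rectangle in lon/lat degrees; an assumption, not the study area.
BBOX = (114.8, 34.4, 122.7, 38.4)
background = sample_background(range(2005, 2009), 2500, BBOX)

# Made-up gains standing in for "fit a model on this subset and return its gain".
TOY_GAIN = {"land_cover": 0.6, "elevation": 0.4, "other_egv": 0.1}


def toy_score(subset):
    return sum(TOY_GAIN[v] for v in subset)


full_gain, results = jackknife(list(TOY_GAIN), toy_score)
for name, row in results.items():
    print(name, row)
```

In this sketch, a variable with a high "only" score carries useful information by itself, whereas a variable with a large "loss" carries information that the other variables do not supply.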
Second, the average values of the area under the curve (AUC) over 10 iterations were compared. Third, the average percentage contribution of each variable was evaluated. In each iteration of the training algorithm, the increase in regularized gain was added to, or the decrease subtracted from, the contribution of the corresponding variable, giving a heuristic estimate of variable contribution for the model.
25 The final model predictors were selected in a stepwise fashion, because saturated models are likely to be oversized, overfitted, or redundant. 33,34 To determine variable significance, several models using the same occurrence data but different variable sets were examined. These included models with single predictors alone, as well as models leaving out individual predictors from suites of variables. The loss in modeling performance of each model was assessed relative to the model generated using all predictors. The algorithm converges to the optimal probability distribution, and the gain is interpreted as representing how much better the distribution fits the sample points than a uniform distribution. 25,29,32

Model evaluation. To validate the accuracy and p...