Forecasting how the risk of pathogen spillover changes over space is essential for the effective deployment of interventions such as human or wildlife vaccination. However, due to the sporadic nature of spillover events, developing robust predictions is challenging. Recent efforts to overcome this obstacle have capitalized on machine learning to predict spillover risk. A weakness of these approaches has been their reliance on human infection data, which is known to suffer from strongly biased reporting. We develop a novel approach that combines sub-models for reservoir species distribution, pathogen distribution, and transmission into the human population. We apply our method to Lassa virus, a zoonotic pathogen with a high threat of emergence in West Africa. The resulting model predicts the distribution of Lassa virus spillover risk and allows us to revise existing estimates for the annual number of new human infections. Our model predicts that between 961,300 and 4,037,400 humans are infected by Lassa virus each year, an estimate that exceeds current conventional wisdom. Our model also predicts that Nigeria accounts for more than half of all new Lassa cases in humans, making it a high-risk area for Lassa virus to become an emergent pathogen.

2 Keywords

Lassa, machine learning, zoonotic pathogen, emerging infectious disease, spillover, risk map

Emerging infectious diseases (EIDs) pose a deadly threat to mankind. Approximately 40% of EIDs are caused by pathogens that circulate in a non-human wildlife reservoir (i.e., zoonotic pathogens) [1]. Prior to full-scale emergence, interaction between humans and wildlife creates opportunities for the occasional transfer, or spillover, of the zoonotic pathogen into human populations [2].
These initial spillover cases, in turn, can give an animal-borne pathogen a foothold for genetic mutations that allow increased transmission among humans [2, 3]. Consequently, a key step in preempting the threat of EIDs is careful monitoring of when and where spillover into the human population is occurring. However, because the majority of EIDs from wildlife originate in low- and middle-income regions with limited health system infrastructure, accurately estimating the rate and geographical range of pathogen spillover, and therefore the risk of new EIDs, is a major challenge [1].

Machine learning techniques have shown promise at predicting the geographical range of spillover risk for several zoonotic diseases including Lassa fever [4-6], Ebola [7], and Leishmaniases [8]. Generally, these models are trained to associate environmental features with the presence or absence of case reports in humans or the associated reservoir. Once inferred from the training process, the learned relationships between disease presence and the environment can be extended across a region of interest. Using these techniques, previous studies of Lassa fever (LF) ...