Remote sensing data have played a significant role in modeling soil erosion. However, previous studies have fallen short in elucidating the multiple interacting factors that influence erosion. This study addresses these limitations by employing the InVEST and Geodetector models. Specifically, it aims (1) to delineate the spatial and temporal variations in soil erosion within the Citarum watershed from 2010 to 2020, (2) to identify the key drivers of soil erosion and unravel the underlying mechanisms, and (3) to identify the high-risk zones for soil erosion. Both models consider a range of natural predictors, including topography (slope factor), climate (precipitation factor), and vegetation cover (vegetation factor). In addition, they incorporate social parameters such as income per capita and population density, which interact with a location's position in the upstream, midstream, or downstream part of the watershed. The results reveal that, over the decade, average soil erosion increased by 15.50 × 10⁶ tons, a rise of 16.65%. The impact of individual factors varies significantly across subwatershed areas. For example, interactions involving fractional vegetation cover dominate in the upstream and midstream regions, while the downstream area is notably affected by interactions involving precipitation. The high-risk erosion areas in the watershed are primarily influenced by slope, precipitation, and fractional vegetation cover, together with other environmental variables categorized into strata. These findings highlight that the dominant drivers of erosion differ among watershed areas.
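For readers unfamiliar with how Geodetector ranks drivers, the following is a minimal sketch of the standard factor-detector q-statistic, q = 1 − SSW/SST, where SSW is the within-stratum sum of squares and SST is the total sum of squares. It is not the authors' implementation. The function names `q_statistic` and `interaction_q` and the sample values are hypothetical and chosen only for illustration; the numbers are not data from this study.

```python
from collections import defaultdict


def q_statistic(values, strata):
    """Geodetector factor-detector q = 1 - SSW/SST.

    values: numeric response per sample (e.g. an erosion rate).
    strata: category label per sample (e.g. a slope class).
    """
    if len(values) != len(strata):
        raise ValueError("values and strata must have the same length")
    if not values:
        raise ValueError("at least one sample is required")

    n = len(values)
    mean = sum(values) / n
    sst = sum((v - mean) ** 2 for v in values)  # total sum of squares
    if sst == 0:
        # Assumption of this sketch: a constant response is reported as q = 0.
        return 0.0

    groups = defaultdict(list)
    for v, s in zip(values, strata):
        groups[s].append(v)

    ssw = 0.0  # within-stratum sum of squares
    for members in groups.values():
        group_mean = sum(members) / len(members)
        ssw += sum((v - group_mean) ** 2 for v in members)

    return 1.0 - ssw / sst


def interaction_q(values, strata_a, strata_b):
    """q for two factors overlaid: each unique label pair forms one stratum."""
    return q_statistic(values, list(zip(strata_a, strata_b)))


# Illustrative values only (not from the study).
erosion = [1.0, 2.0, 5.0, 6.0]
slope_class = ["low", "low", "high", "high"]
rain_class = ["dry", "wet", "dry", "wet"]

q_slope = q_statistic(erosion, slope_class)
q_rain = q_statistic(erosion, rain_class)
q_both = interaction_q(erosion, slope_class, rain_class)
```

A q close to 1 means the stratification explains most of the spatial variance in the response, and a q close to 0 means it explains little. Comparing the overlaid q against the single-factor values is how interaction effects such as those reported above are typically assessed.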