Hyperspectral technology offers a promising alternative to traditional methods for investigating soil arsenic (As) contamination. However, the relationship between soil As content and its spectra can be strongly non-linear, and hyperspectral data contain substantial redundancy. Selecting informative spectral features and constructing models for rapid estimation has therefore become a focus of current research. In this study, soil samples were collected from the site of an abandoned non-ferrous metal factory, and hyperspectral data were acquired in the visible/near-infrared range (400–1000 nm). The raw spectra were preprocessed with Savitzky-Golay (SG) smoothing, multiplicative scatter correction (MSC), and a first-order derivative (FD) transformation. The dataset was then partitioned using the SPXY algorithm, and bands correlated with As content were identified through Spearman correlation analysis. Several feature selection algorithms were each combined with the Extended Feature Algorithm (EFA) to determine the final set of bands. Finally, an Improved Particle Swarm Optimization-Support Vector Machine Regression (IPSO-SVMR) model was constructed, with the selected bands as independent variables and As content as the dependent variable. Among the models evaluated, the band combination obtained with the ICO-SPA feature selection algorithm and EFA performed best, yielding an R² of 0.87435, an RMSE of 22.374, and an RPD of 2.8211 on the validation set. This study provides an effective method for the rapid estimation of soil arsenic content.
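For readers less familiar with the three validation metrics reported above, the following minimal sketch shows one common way to compute them. It is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and the sketch assumes that RPD is the standard deviation of the reference values divided by the RMSE, with both computed over n samples. Under that assumption RPD equals 1/sqrt(1 − R²), which appears consistent with the reported pair of values (R² = 0.87435, RPD = 2.8211), although the abstract does not state which convention was used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, RPD) for paired reference and predicted values.

    Assumption: RPD = SD(reference) / RMSE, with both terms divided by n.
    Other texts use the sample SD (n - 1), which gives slightly larger RPD.
    """
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("reference values are constant; R2 and RPD are undefined")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    sd = math.sqrt(ss_tot / n)  # population standard deviation of references
    rpd = math.inf if rmse == 0 else sd / rmse
    return r2, rmse, rpd


# Hypothetical As contents (reference vs. predicted), for illustration only.
measured = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 33.0, 39.0, 48.0]
r2, rmse, rpd = regression_metrics(measured, predicted)
print(f"R2={r2:.4f}  RMSE={rmse:.4f}  RPD={rpd:.4f}")
```

RMSE carries the units of the measured As content, whereas R² and RPD are dimensionless, so RPD is often the more convenient figure when comparing models across datasets with different concentration ranges.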