Soil organic matter (SOM) content is a key factor affecting the function and health of soil ecosystems. In assessments of land reclamation and soil fertility, visible and near-infrared (Vis-NIR) spectroscopy offers a nondestructive way to monitor SOM content over broad areas and thereby quantify soil quality. To investigate the influence of environmental factors and Vis-NIR spectroscopy on SOM estimation, 249 soil samples were collected from the Werigan–Kuqa oasis in Xinjiang, China, and their spectral reflectance, SOM content, and soil salinity were measured. Soil salinity was also included as an indicator variable to improve the prediction accuracy. Relevant environmental variables were extracted from remote sensing datasets: land-use/land-cover (LULC), digital elevation model (DEM), World Reference Base for Soil Resources (WRB), and soil texture. After Savitzky–Golay (S-G) smoothing and first derivative (FD) preprocessing of the original spectra, K-means clustering of the Vis-NIR spectra yielded three clusters, which were used as spectral classification variables. SOM content estimation models were then constructed using partial least squares regression (PLSR) with four sets of predictors: Vis-NIR alone (Model 1), Vis-NIR combined with spectral classification (Model 2), environmental variables alone (Model 3), and the combination of all variables (Vis-NIR, spectral classification, environmental variables, and soil salinity; Model 4). Of the 249 soil samples, 166 formed the modeling set and 83 formed the validation set. Model 2 (validation r² = 0.78) performed better than Model 1 (validation r² = 0.76), and Model 4 (validation r² = 0.85) performed better than Model 2. Model 3 performed worst (validation r² = 0.39).
Therefore, combining environmental variables with Vis-NIR spectroscopy is an effective approach to estimating SOM content and can improve the accuracy of SOM predictions in arid regions.