Texture features have been consistently overlooked in digital soil mapping, especially in soil salinization mapping. This study aims to clarify how texture information can be leveraged to monitor soil salinization by remote sensing. We propose a novel method for estimating soil salinity content (SSC) that combines spectral and texture information from unmanned aerial vehicle (UAV) images. Reflectance, spectral indices, and one-dimensional (OD) texture features were extracted from the UAV images. Building on the OD texture features, we constructed two-dimensional (TD) and three-dimensional (THD) texture indices. Recursive feature elimination (RFE) was used for feature selection. SSC estimation models were built with three methods: random forest (RF), partial least squares regression (PLSR), and convolutional neural network (CNN). Spatial distribution maps of soil salinity were then generated for each model. The method was validated with 240 surface soil samples collected from a sparsely vegetated arid region in Xinjiang, northwest China. Among all texture indices, TDTeI1 has the highest correlation with SSC (|r| = 0.86). After multidimensional texture information was added, the R² of the RF model increased from 0.76 to 0.90, a relative improvement of 18%. Among the three models, RF outperforms PLSR and CNN. The RF model combining spectral and texture information (SOTT) achieves an R² of 0.90, an RMSE of 5.13 g kg⁻¹, and an RPD of 3.12. Texture information contributes 44.8% to the soil salinity prediction, with the TD and THD texture indices contributing 19.3% and 20.2%, respectively. This study confirms the great potential of texture information for monitoring soil salinity in arid and semi-arid regions.
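The accuracy figures quoted in the abstract (R², RMSE, RPD, and the relative gain in R²) follow standard definitions, which the minimal sketch below illustrates. It is not the authors' code. The function names and the sample SSC values are hypothetical, and the use of the sample standard deviation in RPD is an assumption, since the abstract does not say which variant was used.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """Return (R2, RMSE, RPD) for paired observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / len(observed))
    # RPD: standard deviation of the observations divided by RMSE.
    # Assumption: sample standard deviation (n - 1 in the denominator).
    rpd = statistics.stdev(observed) / rmse
    return r2, rmse, rpd


def relative_gain(before, after):
    """Improvement of a score, as a percentage of its starting value."""
    return 100.0 * (after - before) / before


# Hypothetical SSC values in g/kg, invented for illustration only.
ssc_observed = [2.0, 8.0, 15.0, 30.0, 45.0]
ssc_predicted = [3.0, 7.0, 17.0, 28.0, 44.0]

r2, rmse, rpd = regression_metrics(ssc_observed, ssc_predicted)
print(f"R2={r2:.3f} RMSE={rmse:.3f} RPD={rpd:.3f}")

# The reported rise in R2 from 0.76 to 0.90 is a relative gain of about 18%.
print(f"R2 gain: {relative_gain(0.76, 0.90):.1f}%")
```

Under these definitions, the reported 18% is the change in R² relative to its starting value of 0.76, not the absolute difference of 0.14.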