This paper presents a hybrid feature-extraction and regression-based machine learning approach for predicting chemical oxygen demand (COD) concentrations in water samples from spectral data. The method combines SK-Best and FA to address the high dimensionality and information redundancy typical of small spectral datasets. SK-Best selects the most informative absorbance features, which improves predictive reliability, while FA further reduces dimensionality and extracts valuable information for similarity prediction. The combination of SK-Best, FA, and Linear Regression achieves strong prediction performance (R² ≈ 0.87, MAE = 0.23) and remains interpretable, flexible, and robust on small datasets. The approach is a promising candidate for real-time water quality monitoring and will be further optimized for broader applications.
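As an illustration of the three-stage structure described above, the following is a minimal sketch, not the implementation used in this work. It assumes that SK-Best corresponds to scikit-learn's `SelectKBest` with the `f_regression` score and that FA corresponds to `FactorAnalysis`. The values of `k_features` and `n_factors`, the helper names, and the synthetic spectra are all hypothetical and chosen only so that the example runs.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline


def build_cod_pipeline(k_features=20, n_factors=5):
    """Feature selection -> factor extraction -> linear regression.

    k_features and n_factors are illustrative values, not taken from the paper.
    """
    return Pipeline([
        # Assumption: SK-Best = univariate selection of the k best wavelengths.
        ("select", SelectKBest(score_func=f_regression, k=k_features)),
        # Assumption: FA = factor analysis on the selected absorbances.
        ("factors", FactorAnalysis(n_components=n_factors, random_state=0)),
        ("regress", LinearRegression()),
    ])


def make_synthetic_spectra(n_samples=60, n_wavelengths=200, seed=0):
    """Toy data: one absorbance band whose height scales with the target."""
    rng = np.random.default_rng(seed)
    cod = rng.uniform(0.0, 5.0, size=n_samples)        # synthetic target values
    grid = np.linspace(0.0, 1.0, n_wavelengths)        # normalized wavelength axis
    band = np.exp(-((grid - 0.3) ** 2) / 0.01)         # single Gaussian-shaped band
    noise = rng.normal(scale=0.05, size=(n_samples, n_wavelengths))
    return np.outer(cod, band) + noise, cod


X, y = make_synthetic_spectra()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = build_cod_pipeline()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print(f"R2 = {r2:.3f}, MAE = {mae:.3f}")
```

Wrapping the stages in a single `Pipeline` means that feature selection and factor extraction are fitted on the training split only, which avoids leaking test information on small datasets. The scores printed for the synthetic data say nothing about the results reported here.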