Wavelength selection from a large diffuse reflectance spectroscopy (DRS) dataset enables removal of spectral multicollinearity and thus leads to improved understanding of the feature domain. Feature selection (FS) frameworks are essential to discover the optimal wavelengths for tissue differentiation in DRSbased measurements, which can facilitate the development of compact multispectral optical systems with suitable illumination wavelengths for clinical translation.Aim: The aim was to develop an FS methodology to determine wavelengths with optimal discriminative power for orthopedic applications, while providing the frameworks for adaptation to other clinical scenarios.Approach: An ensemble framework for FS was developed, validated, and compared with frameworks incorporating conventional algorithms, including principal component analysis (PCA), linear discriminant analysis (LDA), and backward interval partial least squares (biPLS).Results: Via the one-versus-rest binary classification approach, a feature subset of 10 wavelengths was selected from each framework yielding comparable balanced accuracy scores (PCA: 94.8 AE 3.47%, LDA: 98.2 AE 2.02%, biPLS: 95.8 AE 3.04%, and ensemble: 95.8 AE 3.16%) to those of using all features (100%) for cortical bone versus the rest class labels. One hundred percent balanced accuracy scores were generated for bone cement versus the rest. Different feature subsets achieving similar outcomes could be identified due to spectral multicollinearity.
Conclusions:Wavelength selection frameworks provide a means to explore domain knowledge and discover important contributors to classification in spectroscopy. The ensemble framework generated a model with improved interpretability and preserved physical interpretation, which serves as the basis to determine illumination wavelengths in optical instrumentation design.