“…Rather than transforming to a new set of variables, it is often desirable to reduce dimensionality by selecting a subset of the original variables (or features) as this enhances the transparency and interpretability of models [9], and in some applications it is a necessity, for example in sensor selection and measurement plan optimisation problems [10], [11]. Since determining the best subset of variables from a candidate set, with respect to a given performance metric, is an NP-hard combinatorial optimisation problem, practical implementations of unsupervised variable selection usually involve approximate solutions such as those provided by greedy search algorithms [12], [13], [14], [15], [16].…”