Determining sample similarity underlies many foundational principles
in analytical chemistry. For example, calibration models are unsuitable
for predicting outliers. Calibration transfer methods assume a moderate
degree of sample and measurement dissimilarities between a calibration
set and target prediction samples. Classification approaches link
target sample similarities to groups of similar class samples. Although
similarity is ubiquitous in analytical chemistry and everyday life,
quantifying sample similarity has no straightforward solution,
especially when target domain samples are unlabeled and the only known
features are measurable, such as spectra (the focus of this paper).
The process proposed to assess sample similarity integrates spectral
similarity information with contextual considerations among source
analyte contents, model, and analyte predictions. This hybrid approach,
named the physicochemical responsive integrated similarity measure
(PRISM), amplifies hidden-but-essential physicochemical properties
encoded within the respective spectra. PRISM is tested on four near-infrared
(NIR) data sets for four diverse application areas to demonstrate its efficacy.
These applications are prediction reliability assessment, model
updating for model generalizability, outlier detection, and
basic matrix matching evaluation. Discussion is provided on adapting
PRISM to classification problems. Results indicate that PRISM collects
large amounts of similarity information and effectively integrates
it to produce a quantitative similarity evaluation between the target
sample and a source domain. The approach is also useful for biological
samples with additional physicochemical variations. While PRISM is
dynamically tested on NIR data, parts of PRISM were previously applied
to other data types, and PRISM should be applicable to other measurement
systems perturbed by matrix effects.