Route determination of sulfur mustard was accomplished through
comprehensive nontargeted screening of chemical attribution signatures.
Sulfur mustard samples prepared via 11 different synthetic routes
were analyzed using gas chromatography/high-resolution mass spectrometry.
A large number of compounds were detected, and multivariate data analysis
of the mass spectrometric results enabled the discovery of route-specific
signature profiles. The performance of two supervised machine learning
algorithms for retrospective synthetic route attribution, orthogonal
partial least squares discriminant analysis (OPLS-DA) and random forest
(RF), was compared using external test sets. Complete classification
accuracy was achieved for test set samples (2/2 and 9/9) by using
classification models to resolve the one-step routes starting from
ethylene and the thiodiglycol chlorination methods used in the two-step
routes. Retrospective determination of initial thiodiglycol synthesis
methods in sulfur mustard samples, following chlorination, was more
difficult. Nevertheless, the large number of markers detected using
the nontargeted methodology enabled correct assignment of 5/9 test
set samples using OPLS-DA and 8/9 using RF. RF was also used to construct
an 11-class model with a total classification accuracy of 10/11. The
developed methods were further evaluated by classifying sulfur mustard
spiked into soil and textile matrix samples. Due to matrix effects
and the low spiking level (0.05% w/w), route determination was more
challenging in these cases. Even so, acceptable classification
performance was achieved during external test set validation: chlorination
methods were correctly classified for 12/18 spiked soil samples and
11/15 spiked textile samples.
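
To make the reported counts easier to compare, the short sketch below converts them into accuracy fractions and percentages. It is only an illustrative calculation: the counts are copied from the text above, while the dictionary labels and the helper name `accuracy` are assumptions introduced here, not part of the original work. The 2/2 and 9/9 results are omitted because they are stated to be fully correct.

```python
from fractions import Fraction

# (correct, total) pairs copied from the text; the labels are illustrative only.
reported = {
    "thiodiglycol synthesis method, OPLS-DA": (5, 9),
    "thiodiglycol synthesis method, RF": (8, 9),
    "11-class model, RF": (10, 11),
    "chlorination method, spiked soil": (12, 18),
    "chlorination method, spiked textile": (11, 15),
}


def accuracy(correct, total):
    """Return the fraction of correctly classified test samples (hypothetical helper)."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("expected 0 <= correct <= total and total > 0")
    return Fraction(correct, total)


# Exact fractions avoid rounding until the values are displayed.
accuracies = {label: accuracy(c, t) for label, (c, t) in reported.items()}

for label, acc in accuracies.items():
    print(f"{label}: {acc.numerator}/{acc.denominator} = {float(acc):.1%}")
```

With test sets this small, a single sample shifts the percentage by roughly 5 to 11 points, so the fractions are arguably more informative than the percentages alone.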