An integrated workflow based on liquid chromatography coupled to a quadrupole-time-of-flight mass spectrometer (LC-QTOF-MS) was developed and applied to detect and identify suspect and unknown contaminants in Greek wastewater. Tentative identifications were initially based on mass accuracy, isotopic pattern, plausibility of the chromatographic retention time, and MS/MS spectral interpretation (comparison with spectral libraries, in silico fragmentation). Moreover, new specific strategies for the identification of metabolites were applied to obtain extra confidence, including the comparison of diurnal and/or weekly concentration trends of the metabolite and parent compounds and the complementary use of HILIC. Thirteen of 284 predicted and literature metabolites of selected pharmaceuticals and nicotine were tentatively identified in influent samples from Athens, and seven were finally confirmed with reference standards. Thirty-four nontarget compounds were tentatively identified, of which four were also confirmed. The sulfonated surfactant diglycol ether sulfate was identified along with others in the homologous series (SO4C2H4(OC2H4)xOH), which have not been previously reported in wastewater. As many surfactants were originally found as nontargets, these compounds were studied in detail through retrospective analysis.
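The mass accuracy criterion mentioned above is commonly expressed as a relative error in parts per million (ppm). The following is a minimal sketch of that calculation only. The function names, the 5 ppm tolerance, and the example m/z values are assumptions chosen for illustration and are not taken from the study.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in ppm between a measured and a theoretical m/z."""
    if theoretical_mz <= 0:
        raise ValueError("theoretical m/z must be positive")
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float,
                     tol_ppm: float = 5.0) -> bool:
    """True if the absolute mass error is within tol_ppm (assumed threshold)."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    # Illustrative values only: approximate [M+H]+ m/z of nicotine and a
    # hypothetical measured value, not data from the study.
    theoretical = 163.1230
    measured = 163.1236
    print(f"mass error: {ppm_error(measured, theoretical):.2f} ppm")
    print("accepted:", within_tolerance(measured, theoretical))
```

In a screening workflow such a check would be only the first filter. Isotopic pattern, retention time plausibility, and MS/MS evidence are evaluated separately.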
Over the past decade, the application of liquid chromatography-high resolution mass spectrometry (LC-HRMS) has been growing extensively due to its ability to analyze a wide range of suspected and unknown compounds in environmental samples. However, various criteria, such as mass accuracy and isotopic pattern of the precursor ion, MS/MS spectra evaluation, and retention time plausibility, should be met to reach a certain identification confidence. In this context, a comprehensive workflow based on computational tools was developed to understand the retention time behavior of a large number of compounds belonging to emerging contaminants. Two extensive data sets were built for two chromatographic systems, one for positive and one for negative electrospray ionization mode, containing information for the retention time of 528 and 298 compounds, respectively, to expand the applicability domain of the developed models. Then, the data sets were split into training and test sets, employing k-nearest neighborhood clustering, to build the models and validate their internal and external prediction ability. The best subset of molecular descriptors was selected using genetic algorithms. Multiple linear regression, artificial neural networks, and support vector machines were used to correlate the selected descriptors with the experimental retention times. Several validation techniques were used, including Golbraikh–Tropsha acceptable model criteria, Euclidean-based applicability domain, modified correlation coefficient ($r_m^2$), and concordance correlation coefficient values, to measure the accuracy and precision of the models. The best linear and nonlinear models for each data set were derived and used to predict the retention time of suspect compounds of a wide-scope survey, as the evaluation data set. For efficient outlier detection and interpretation of the origin of the prediction error, a novel procedure and tool were developed and applied, making it possible to identify whether a suspect compound was within the applicability domain.
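One of the validation metrics named above, the concordance correlation coefficient, measures how closely predicted values agree with observed ones, penalizing both scatter and systematic offset. The sketch below uses Lin's standard definition with population (1/n) moments. It is not the authors' implementation, and the function name and the retention times in the usage example are hypothetical.

```python
from statistics import fmean


def concordance_correlation(observed, predicted):
    """Lin's concordance correlation coefficient using population moments.

    CCC = 2 * cov(o, p) / (var(o) + var(p) + (mean(o) - mean(p))**2)
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(observed)
    mean_o = fmean(observed)
    mean_p = fmean(predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    denom = var_o + var_p + (mean_o - mean_p) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for two identical constant series")
    return 2 * cov / denom


if __name__ == "__main__":
    # Hypothetical experimental and predicted retention times in minutes.
    experimental_rt = [2.1, 4.8, 6.3, 9.0, 11.7]
    predicted_rt = [2.4, 4.5, 6.9, 8.6, 12.1]
    print(f"CCC = {concordance_correlation(experimental_rt, predicted_rt):.3f}")
```

A value of 1 indicates perfect agreement. A constant offset between predicted and experimental retention times lowers the value even when the two series are perfectly correlated, which is why this metric complements an ordinary correlation coefficient.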
Untargeted analysis of a composite house dust sample has been performed as part of a collaborative effort to evaluate the progress in the field of suspect and nontarget screening and build an extensive database of organic indoor environment contaminants. Twenty-one participants reported results that were curated by the organizers of the collaborative trial. In total, nearly 2350 compounds were identified (18%) or tentatively identified (25% at confidence level 2 and 58% at confidence level 3), making the collaborative trial a success. However, a relatively small share (37%) of all compounds were reported by more than one participant, which shows that there is plenty of room for improvement in the field of suspect and nontarget screening. An even smaller share (5%) of the total number of compounds were detected using both liquid chromatography–mass spectrometry (LC-MS) and gas chromatography–mass spectrometry (GC-MS). Thus, the two MS techniques are highly complementary. Most of the compounds were detected using LC with electrospray ionization (ESI) MS and comprehensive 2D GC (GC×GC) with atmospheric pressure chemical ionization (APCI) and electron ionization (EI), respectively. Collectively, the three techniques accounted for more than 75% of the reported compounds. Glycols, pharmaceuticals, pesticides, and various biogenic compounds dominated among the compounds reported by LC-MS participants, while hydrocarbons, hydrocarbon derivatives, chlorinated paraffins, and chlorinated biphenyls were primarily reported by GC-MS participants. Plastics additives, flavors and fragrances, and personal care products were reported by both LC-MS and GC-MS participants. It was concluded that the use of multiple analytical techniques was required for a comprehensive characterization of house dust contaminants.
Further, several recommendations are given for improved suspect and nontarget screening of house dust and other indoor environment samples, including the use of open-source data processing tools. One of the tools allowed provisional identification of almost 500 compounds that had not been reported by participants.
Electronic supplementary material
The online version of this article (10.1007/s00216-019-01615-6) contains supplementary material, which is available to authorized users.