The aim of this work was to develop a general Monte Carlo framework for validating discriminant models used in authenticity studies based on chromatographic impurity profiles. The framework was applied to evaluate the usefulness of the diagnostic logic rule obtained from a partial least squares discriminant analysis (PLS-DA) model built to discriminate authentic Viagra® samples from counterfeits (a two-class problem). The major advantage of the proposed validation framework is that it yields distributions of the figures of merit that describe the PLS-DA model (e.g., sensitivity, specificity, correct classification rate and area under the curve) as a function of model complexity. Their uncertainty can therefore be estimated quickly. Moreover, Monte Carlo model validation allows balanced sets of training samples to be designed, which is required when constructing a PLS-DA model, and is recommended for obtaining fair estimates based on an independent set of samples. In this study, as an illustrative example, 46 authentic Viagra® samples and 97 counterfeit samples were analyzed and described by their impurity profiles, which were determined using high performance liquid chromatography with photodiode array detection, and were then discriminated using the PLS-DA approach. In addition, we demonstrated how to extend the Monte Carlo validation framework with four different variable selection schemes: uninformative variable elimination, variable importance in projection, selectivity ratio and significance multivariate correlation. The best PLS-DA model was based on a subset of variables selected using the variable importance in projection approach. For an independent test set, the average estimates with the corresponding standard deviations (based on 1000 Monte Carlo runs) of the correct classification rate, sensitivity, specificity and area under the curve were 96.42% ± 2.04, 98.69% ± 1.38, 94.16% ± 3.52 and 0.982 ± 0.017, respectively.
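The idea of Monte Carlo validation with balanced training sets can be illustrated with a minimal sketch. It is not the authors' implementation: a nearest-centroid classifier stands in for PLS-DA, the data are synthetic (only the class sizes, 46 and 97, are taken from the abstract), treating counterfeits as the positive class is an assumption, and the area under the curve is omitted for brevity. All function names are hypothetical.

```python
import random
import statistics


def fit_centroids(X, y):
    """Stand-in classifier: one mean vector per class (not PLS-DA)."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class with the closest centroid (squared Euclidean)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def monte_carlo_validation(X, y, n_train_per_class, n_runs, seed=0):
    """Repeat balanced random splits and collect test-set figures of merit."""
    rng = random.Random(seed)
    by_class = {lab: [i for i, v in enumerate(y) if v == lab] for lab in (0, 1)}
    merits = {"sensitivity": [], "specificity": [], "ccr": []}
    for _ in range(n_runs):
        # Balanced training set: the same number of samples from each class.
        train = []
        for lab in (0, 1):
            train.extend(rng.sample(by_class[lab], n_train_per_class))
        in_train = set(train)
        # Everything not drawn for training forms the independent test set.
        test = [i for i in range(len(y)) if i not in in_train]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        tp = tn = fp = fn = 0
        for i in test:
            pred = predict(model, X[i])
            if y[i] == 1:
                tp += pred == 1
                fn += pred == 0
            else:
                tn += pred == 0
                fp += pred == 1
        merits["sensitivity"].append(tp / (tp + fn))
        merits["specificity"].append(tn / (tn + fp))
        merits["ccr"].append((tp + tn) / len(test))
    return merits


# Synthetic stand-in data: 46 "authentic" (0) and 97 "counterfeit" (1) samples.
data_rng = random.Random(42)
y = [0] * 46 + [1] * 97
X = [[data_rng.gauss(3.0 * lab, 1.0) for _ in range(5)] for lab in y]

merits = monte_carlo_validation(X, y, n_train_per_class=30, n_runs=200)
for name, values in merits.items():
    print(f"{name}: {statistics.mean(values):.3f} ± {statistics.stdev(values):.3f}")
```

Because every run yields one value per figure of merit, the collected lists are empirical distributions, so a mean and a standard deviation can be reported in the same "mean ± SD" form used in the abstract.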
In the countries of the European Community, diesel fuel is spiked with Solvent Yellow 124 and either Solvent Red 19 or Solvent Red 164. Their presence at a given concentration indicates the specific tax rate and determines the permitted use of the fuel. The removal of these so-called excise duty components, known as fuel “laundering”, is illegal and causes substantial losses to government budgets. The aim of our study was to show that genuine diesel fuel samples and their counterfeit variants (obtained from a simulated sorption process) can be differentiated using their gas chromatographic fingerprints registered with a flame ionization detector. To this end, a partial least squares discriminant analysis (PLS-DA) model was constructed and validated for the genuine and counterfeit oil fingerprints after baseline correction and peak alignment. Uninformative variable elimination (UVE), variable importance in projection (VIP), and selectivity ratio (SR), each coupled with a bootstrap procedure, were adapted to PLS-DA in order to limit the risk of model overfitting. Several major chemical components within the regions relevant to the discrimination problem were suggested as the most influential. We also found that the bootstrap variants of UVE-PLS-DA and SR-PLS-DA have excellent predictive abilities with a limited number of gas chromatographic features, 14 and 16, respectively. This conclusion was also supported by the values of one obtained for the area under the receiver operating characteristic curve (AUC), independently for the model and test sets.
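As an illustration of one of the selection criteria named above, the following sketch computes VIP scores from a single-response PLS model fitted with NIPALS and averages them over bootstrap resamples. It is a sketch under stated assumptions: the abstract does not say how the bootstrap was coupled to VIP, so simple averaging is an assumption, the data are synthetic, and all function names are hypothetical.

```python
import math
import random


def center(X, y):
    """Mean-center the columns of X and the response y."""
    n = len(X)
    col_means = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    Xc = [[v - m for v, m in zip(row, col_means)] for row in X]
    return Xc, [v - y_mean for v in y]


def pls1_nipals(X, y, n_components):
    """PLS1 via NIPALS; returns weights W, scores T and y-loadings Q."""
    X, y = center(X, y)
    n, p = len(X), len(X[0])
    W, T, Q = [], [], []
    for _ in range(n_components):
        w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]  # unit-length weight vector
        t = [sum(xv * wv for xv, wv in zip(row, w)) for row in X]
        tt = sum(v * v for v in t)
        load = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(yv * tv for yv, tv in zip(y, t)) / tt
        # Deflate X and y before extracting the next component.
        X = [[X[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        y = [yv - q * tv for yv, tv in zip(y, t)]
        W.append(w)
        T.append(t)
        Q.append(q)
    return W, T, Q


def vip_scores(W, T, Q):
    """VIP_j = sqrt(p * sum_a SS_a * w_ja^2 / sum_a SS_a), SS_a = q_a^2 * t_a't_a."""
    p = len(W[0])
    ss = [q * q * sum(v * v for v in t) for q, t in zip(Q, T)]
    return [math.sqrt(p * sum(s * w[j] ** 2 for s, w in zip(ss, W)) / sum(ss))
            for j in range(p)]


def bootstrap_vip(X, y, n_components, n_boot, seed=0):
    """Average VIP over bootstrap resamples (assumed coupling, see lead-in)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    totals = [0.0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        model = pls1_nipals([X[i] for i in idx], [y[i] for i in idx], n_components)
        totals = [a + b for a, b in zip(totals, vip_scores(*model))]
    return [v / n_boot for v in totals]


# Synthetic data: 2 informative variables followed by 4 pure-noise variables.
rng = random.Random(7)
y = [-1.0] * 20 + [1.0] * 20
X = [[rng.gauss(1.5 * lab, 1.0), rng.gauss(1.5 * lab, 1.0)]
     + [rng.gauss(0.0, 1.0) for _ in range(4)] for lab in y]

vip = vip_scores(*pls1_nipals(X, y, n_components=2))
mean_vip = bootstrap_vip(X, y, n_components=2, n_boot=50)
print([round(v, 2) for v in mean_vip])
```

Because the weight vectors have unit length, the squared VIP scores of a single fit average to one, which is why a threshold of 1 is a common (though not universal) cut-off for retaining variables.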
Counterfeit medicines are a global threat to public health. Large quantities enter the European market, which is why the characterization of these products is a very important issue. In this study, high-performance liquid chromatography-photodiode array (HPLC-PDA) and high-performance liquid chromatography-mass spectrometry (HPLC-MS) methods were developed for the analysis of genuine Viagra®, generic products of Viagra®, and counterfeit samples in order to obtain different types of fingerprints. These data were included in the chemometric data analysis, which aimed to test whether PDA and MS are complementary detection techniques. The MS data comprise both MS1 and MS2 fingerprints; the PDA data consist of fingerprints measured at three different wavelengths, i.e., 254, 270, and 290 nm, and all possible combinations of these wavelengths. First, it was verified whether each group of fingerprints separately can discriminate between genuine, generic, and counterfeit medicines; next, it was studied whether the results could be improved by combining both fingerprint types. This data analysis showed that MS1 does not provide suitable classification models, since several genuines and generics are classified as counterfeits and vice versa. However, when the MS1_MS2 data were analyzed with partial least squares-discriminant analysis (PLS-DA), a perfect discrimination was obtained. When only data measured at 254 nm are used, good classification models can be obtained with k nearest neighbors (kNN) and soft independent modelling of class analogy (SIMCA), which might be of interest for the characterization of counterfeit drugs in developing countries. In general, however, the combination of PDA and MS data (254 nm_MS1) is preferred because it yields fewer classification errors between the genuines/generics and the counterfeits than PDA or MS data alone.
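Combining two fingerprint types before classification can be sketched as follows. The abstract does not state how the PDA and MS data were combined, so this example assumes simple low-level fusion: each block is centered and scaled to unit total sum of squares, the blocks are concatenated, and a kNN classifier is evaluated by leave-one-out. The data are synthetic and all names are hypothetical.

```python
import math
import random
from collections import Counter


def block_scale(block):
    """Center the columns, then scale the whole block to unit sum of squares."""
    n = len(block)
    means = [sum(col) / n for col in zip(*block)]
    centered = [[v - m for v, m in zip(row, means)] for row in block]
    frob = math.sqrt(sum(v * v for row in centered for v in row))
    return [[v / frob for v in row] for row in centered]


def fuse(*blocks):
    """Low-level fusion: concatenate the scaled blocks sample by sample."""
    scaled = [block_scale(b) for b in blocks]
    return [[v for part in parts for v in part] for parts in zip(*scaled)]


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training samples (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def loo_accuracy(X, y, k=3):
    """Leave-one-out correct classification rate of kNN.

    Note: scaling is done once on all samples, a simplification that
    lets the held-out sample influence the centering.
    """
    hits = 0
    for i in range(len(X)):
        pred = knn_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], k)
        hits += pred == y[i]
    return hits / len(X)


# Synthetic fingerprints, 15 samples per class. By construction the "PDA"
# block separates only counterfeits and the "MS" block only generics.
rng = random.Random(1)
labels = ["genuine"] * 15 + ["generic"] * 15 + ["counterfeit"] * 15
pda = [[rng.gauss(3.0 if lab == "counterfeit" else 0.0, 1.0) for _ in range(10)]
       for lab in labels]
ms = [[rng.gauss(3.0 if lab == "generic" else 0.0, 1.0) for _ in range(20)]
      for lab in labels]

fused = fuse(pda, ms)
accuracy = {
    "pda": loo_accuracy(block_scale(pda), labels),
    "ms": loo_accuracy(block_scale(ms), labels),
    "fused": loo_accuracy(fused, labels),
}
print(accuracy)
```

Block scaling gives each detector the same total weight in the distance calculation, so the block with more variables does not dominate simply because of its size.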