The aim of this work was to study the metabolic characteristics of saliva in lung cancer for use in early diagnosis and in determining the prognosis of the disease. The study included 425 lung cancer patients, 168 patients with non-cancerous lung diseases, and 550 healthy volunteers. Saliva samples were collected from all participants before treatment, and 34 biochemical saliva parameters were determined. Participants were monitored for six years to assess survival rates. Statistical analysis was performed using Statistica 10.0 (StatSoft) and R (version 3.2.3). The classifier was constructed with the Random Forest method, and classification quality was assessed by cross-validation. Prognostic factors were analyzed by multivariate analysis using the Cox proportional hazards model with backward stepwise selection to adjust for potential confounding factors. A complex of metabolic changes occurring in saliva in lung cancer is described. Seven biochemical parameters were identified (catalase, triene conjugates, Schiff bases, pH, sialic acids, alkaline phosphatase, chlorides) and used to construct the classifier. The sensitivity and specificity of the method were 69.5% and 87.5%, respectively, which is practically not inferior to the diagnostic characteristics of markers routinely used in the diagnosis of lung cancer. Significant independent factors of poor prognosis in lung cancer were imidazole compound (IC) levels above 0.478 mmol/L and salivary lactate dehydrogenase activity below 545 U/L. Saliva has been shown to have great potential for the development of diagnostic and prognostic tests for lung cancer.
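As a reminder of how the two reported diagnostic characteristics are defined, the following minimal sketch computes sensitivity and specificity from true and predicted labels, for example the held-out predictions pooled across cross-validation folds. The function name and the toy labels are illustrative assumptions and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Labels are assumed to be 1 = lung cancer, 0 = control.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cancers detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cancers missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    sensitivity = tp / (tp + fn)  # share of true cases that are detected
    specificity = tn / (tn + fp)  # share of controls correctly cleared
    return sensitivity, specificity


# Toy labels (hypothetical, not study data): 4 cases and 6 controls.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In this toy case three of four cases are detected and five of six controls are cleared. The reported 69.5% and 87.5% would be obtained in the same way from the study's own cross-validated predictions.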
Background: From a mathematical point of view, problems of medical diagnostics are data classification tasks. It is important to understand how strongly errors in the collection of primary diagnostic information, in particular the results of biochemical tests, can distort the classification result.
Aims: To determine how the prediction result depends on the variability of the primary diagnostic information, using a model classifier as an example.
Materials and methods: The case-control study enrolled participants who were divided into 2 groups: the main group (diagnosed with lung cancer, n=200) and the control group (conditionally healthy, n=500). All participants completed a questionnaire and underwent a biochemical saliva study. Patients of the main group and the comparison group were hospitalized for surgical treatment, after which the diagnosis was verified histologically. The biochemical composition of saliva was determined spectrophotometrically. Based on the data obtained, a model classifier for the diagnosis of lung cancer (a random forest) was constructed. Each parameter underlying the classifier was perturbed within a specified range (±1–5%, ±5–10%, ±10–15%), creating synthetic images. The classification results were then evaluated by cross-validation.
Results: The baseline diagnostic characteristics of the model classifier were determined (sensitivity 72.5%, specificity 86.0%). As the deviations of the synthetic images from the baseline increase, the diagnostic characteristics of the general classification deteriorate. Confident classification, in contrast, gives higher values (sensitivity 81.8%, specificity 93.1%). In confident classification, similar images that fall into different classes according to the classification results are removed, whereas in general classification they are taken into account. The difference between the two classification methods is associated with images for which the classifier returns a class-membership value in the range 0.45–0.55. It is therefore necessary to introduce a third class into the classifier, the so-called gray zone (0.4–0.6), since the probability of an erroneous diagnosis in this region is significantly increased.
Conclusions: The results indicate that a measurement error in the range ±1–15% does not significantly affect the quality of the classification.
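The perturbation step and the proposed gray zone can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract gives only the deviation ranges and the 0.4–0.6 interval, so the uniform sampling of the deviation magnitude, the random sign, the inclusive boundaries, the function names, and the example values are all hypothetical.

```python
import random


def perturb(values, low_pct, high_pct, rng):
    """Return a synthetic copy of `values` with each entry shifted by a
    relative deviation of magnitude low_pct..high_pct percent.

    Assumption: the magnitude is uniform in the range and the sign is random.
    """
    synthetic = []
    for v in values:
        magnitude = rng.uniform(low_pct, high_pct) / 100.0
        sign = rng.choice((-1.0, 1.0))
        synthetic.append(v * (1.0 + sign * magnitude))
    return synthetic


def classify_with_gray_zone(p_cancer, low=0.4, high=0.6):
    """Map a classifier output in [0, 1] to one of three classes.

    Assumption: the gray-zone boundaries 0.4 and 0.6 are inclusive.
    """
    if p_cancer < low:
        return "control"
    if p_cancer > high:
        return "lung cancer"
    return "gray zone"


rng = random.Random(0)           # fixed seed so the sketch is repeatable
baseline = [6.8, 120.0, 0.35]    # hypothetical parameter values
synthetic = perturb(baseline, 5, 10, rng)
labels = [classify_with_gray_zone(p) for p in (0.12, 0.50, 0.87)]
print(synthetic, labels)
```

Returning an explicit "gray zone" label instead of forcing a binary decision corresponds to the confident classification described in the results, where outputs near 0.5 are set aside instead of being counted as errors.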