Hepatitis is an inflammation of the liver, most often caused by viral infection. Other causes include secondary effects of drugs, toxins, and alcohol, as well as autoimmune disease. The autoimmune form occurs when the body produces antibodies against liver tissue, and it affects many people worldwide. Diagnosing hepatitis involves testing and analyzing a range of clinical factors and parameters. Because so many parameters are examined, analyzing them at scale is challenging and complicated for experts in the field. Emerging technologies such as machine learning, a subset of artificial intelligence, now allow healthcare experts to identify the factors influencing patient death rates quickly and accurately. In this study, the KNN and SVM machine learning techniques were used to analyze the positive effect of clinical parameters such as LIVER BIG, LIVER FIRM, SPLEEN PALPABLE, and ANOREXIA on patients' survival or death rates. The study analyzes the results of the implementation in two parts. The first part determines the positive impact of these clinical parameters on the death and survival rates of patients, and the second part examines the performance of the machine learning techniques using the evaluation criteria of accuracy (ACC), error rate (ERR), specificity (SPE), and negative prediction value (NPV). Based on the results of applying the machine learning techniques to data on hepatitis patients, it was determined that patients with positive LIVER BIG, LIVER FIRM, SPLEEN PALPABLE, and ANOREXIA clinical parameters can have a high chance of survival. 
In the performance analysis, the SVM technique outperformed the KNN technique, with ACC 94.05%, ERR 16.02%, SPE 93.07%, and NPV 85.7%.
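The four evaluation criteria named above can be computed from the counts of a binary confusion matrix. The following is a minimal sketch under the standard textbook definitions, which the abstract does not spell out and which may differ from the ones the authors used (note that under these definitions ERR equals 1 minus ACC). The names `ConfusionCounts`, `count_outcomes`, and `evaluate` are hypothetical, and the labels in the usage example are made-up toy data, not the hepatitis data set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container, not from the source)."""
    tp: int  # actual positive, predicted positive
    fp: int  # actual negative, predicted positive
    tn: int  # actual negative, predicted negative
    fn: int  # actual positive, predicted negative


def count_outcomes(y_true, y_pred, positive=1):
    """Tally TP, FP, TN, FN from paired true and predicted labels."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def evaluate(c):
    """Compute ACC, ERR, SPE, and NPV under the standard definitions (an assumption)."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "ACC": _ratio(c.tp + c.tn, total),   # share of correct predictions
        "ERR": _ratio(c.fp + c.fn, total),   # share of wrong predictions
        "SPE": _ratio(c.tn, c.tn + c.fp),    # true negatives among actual negatives
        "NPV": _ratio(c.tn, c.tn + c.fn),    # true negatives among predicted negatives
    }


if __name__ == "__main__":
    # Toy labels for illustration only (1 = positive class, 0 = negative class).
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
    print(evaluate(count_outcomes(y_true, y_pred)))
```

Which class counts as "positive" (survival or death) is a modeling choice the abstract does not state; the `positive` argument makes that choice explicit.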
Many data mining techniques are available today, and they yield different results with varying precision; selecting an appropriate methodology therefore leads to a more complete and accurate analysis. Several criteria exist for evaluating the effectiveness of these techniques, and the right choice of technique depends on the type of data to which it will be applied. Data is significant in every field, but it plays an especially important role in some, such as healthcare and cancer data collection. Analyzing sensitive data such as cancer records with data mining techniques is challenging when the available information is incomplete, which can significantly affect the results. For patients with lymphoma, the multitude of factors causing the disease and the lack of information are significant challenges. Lymphomas, which are common cancers, are classified as either Hodgkin's or non-Hodgkin's lymphoma. In this research, tumor markers were selected on the criterion that they are common to both types of lymphoma. Five tumor markers, CD3, CD15, CD20, CD30, and LCA, along with the type of lymphoma and the patient's gender, were selected as the variables of this research. The two data mining techniques, Bayesian networks (Naive Bayes) and the decision tree, are evaluated using the criteria of accuracy, sensitivity, F-score, and error ratio. To determine whether the diagnostic factors have a positive impact, a 90% confidence interval and a 65% support value were selected, so that the factors effective in diagnosing lymphoma are identified with the highest level of accuracy. 
Based on the implementation of the techniques and their evaluation, it was determined that the decision tree technique outperformed the Bayesian networks (Naive Bayes) technique, with an accuracy of 82.66%, a sensitivity of 94.98%, a harmonic mean of 85.36%, and an error ratio of 17.33%. Our research also concluded that positive CD3 and CD15 tumor markers, as well as the gender of the individual, do not play a role in the diagnosis of lymphoma. However, the CD20 and LCA tumor markers can be effective in diagnosing non-Hodgkin's lymphoma, while the CD30 tumor marker can be effective in diagnosing Hodgkin's lymphoma.
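If the 90% confidence and 65% support thresholds are read as association-rule measures (an assumption; the abstract does not define them), a rule such as "CD20 positive implies non-Hodgkin's lymphoma" could be checked as in the sketch below. The function names `support_confidence` and `passes_thresholds`, the record layout, and the toy records are hypothetical and are not taken from the study's data.

```python
def support_confidence(records, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = share of all records where both conditions hold
    confidence = share of antecedent-matching records where the consequent also holds
    """
    if not records:
        raise ValueError("records must not be empty")
    matches_antecedent = [r for r in records if antecedent(r)]
    matches_both = [r for r in matches_antecedent if consequent(r)]
    support = len(matches_both) / len(records)
    confidence = (
        len(matches_both) / len(matches_antecedent) if matches_antecedent else 0.0
    )
    return support, confidence


def passes_thresholds(support, confidence, min_support=0.65, min_confidence=0.90):
    """Apply the thresholds quoted in the abstract (their exact meaning is assumed)."""
    return support >= min_support and confidence >= min_confidence


if __name__ == "__main__":
    # Made-up toy records for illustration only.
    records = (
        [{"CD20": True, "type": "non-Hodgkin"}] * 7
        + [{"CD20": True, "type": "Hodgkin"}] * 1
        + [{"CD20": False, "type": "Hodgkin"}] * 2
    )
    s, c = support_confidence(
        records,
        antecedent=lambda r: r["CD20"],
        consequent=lambda r: r["type"] == "non-Hodgkin",
    )
    print(f"support={s:.3f} confidence={c:.3f} accepted={passes_thresholds(s, c)}")
```

Passing the antecedent and consequent as predicates keeps the helper independent of any particular marker, so the same check could be repeated for CD3, CD15, CD30, LCA, or gender.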
Early detection is the only way to effectively control diseases whose treatment can be challenging, expensive, and time-consuming. Identifying the factors that influence the occurrence of a disease can therefore reduce the time needed for diagnosis and provide a solid foundation for improving the prognosis and preventing patients' deterioration. Applying data mining techniques as a novel approach to identifying disease-causing agents can significantly assist early detection. This study investigates the effect of epidermoid and adeno tissues on the incidence of bone, bone marrow, lung, and neck cancer by conducting a data mining process on cancer patient data sets. Two data mining techniques, K-nearest neighbor and decision tree, were implemented on the data of patients with these four types of cancer, and their performance was evaluated using three criteria: accuracy, error ratio, and negative prediction value. The evaluation indicates that the decision tree technique performed better, with an accuracy of 89.10%, an error ratio of 14.04%, and a negative prediction value of 77.71%. The findings also show that contamination of epidermoid and adeno tissues does not affect the early detection of any of the four categories of bone, bone marrow, lung, or neck cancer. In other words, infection of the epidermoid and adeno tissues cannot be the cause of these four types of cancer.