Machine learning is commonly used for prediction, pattern recognition, and modeling the relationships between variables. Causal machine learning combines approaches for analyzing the causal impact of an intervention on an outcome, assuming considerably ambiguous variables. Combining causality with machine learning is well suited to predicting and understanding the causes and effects behind results. The aim of this study is to systematically review which causal machine learning approaches are generally used. The paper focuses on which data characteristics appear in causal machine learning research and how the output of the algorithms used in that research is assessed. The review analyzes 20 papers with various approaches and categorizes data characteristics by data type, attribute value, and data dimension. The Bayesian Network (BN) is commonly used in the context of causality. Meanwhile, the propensity score is the most extensively used in causality research. The variable values affect algorithm performance. This review can serve as a guide for selecting a causal machine learning system.
Cardiotocography is a series of examinations used to determine the health of the fetus during pregnancy. The examination records the baby's heart rate to establish whether the fetus is in a healthy condition or not. In addition, uterine contractions are used to determine the health condition of the fetus. Fetal health is classified into three conditions: normal, suspect, and pathological. This paper compares classification algorithms for diagnosing the result of the cardiotocographic examination. The experiments are run both with and without feature selection. CFS Subset Evaluation, Info Gain, and Chi-Square are used to select the most relevant features. The data set was obtained from the freely available UCI Machine Learning repository. To assess the performance of the classification algorithms, this study uses the evaluation metrics Precision, Recall, F-Measure, MCC, ROC, PRC, and Accuracy. The results show that all algorithms provide fairly good classification. However, the combination of the Random Forest algorithm and Info Gain feature selection gives the best results, with an accuracy of 93.74%.
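The Info Gain ranking mentioned in this abstract can be illustrated with a minimal sketch. It assumes features that are already discretized, and the feature names and toy rows below are hypothetical illustrations, not values from the UCI cardiotocography data; the abstract does not state which tool the authors used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    scores = {
        name: information_gain([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: two discretized features, three fetal-state classes.
names = ["accelerations", "noise"]
rows = [
    ("high", "a"), ("high", "b"),
    ("low", "a"), ("low", "b"),
    ("none", "a"), ("none", "b"),
]
labels = ["normal", "normal", "suspect", "suspect", "pathological", "pathological"]

ranking = rank_features(rows, labels, names)
for name, gain in ranking:
    print(f"{name}: {gain:.3f} bits")
```

In this toy set the first feature determines the class completely, so its gain equals the full label entropy, while the second feature carries no information. A selection step would keep the top-ranked features before training a classifier such as Random Forest.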
Credit scoring is a model commonly used in the decision-making process to refuse or accept loan requests. The credit score model depends on the type of loan or credit and is complemented by various credit factors. At present, there is no accurate model for determining which applicants are eligible for loans. Therefore, an accurate and automatic model is needed to make it easier for banks to identify suitable applicants. To address the problem, we propose a new approach that combines a machine learning algorithm (Naïve Bayes), Information Gain (IG), and discretization to classify applicants. This research employed an experimental method using the Weka application. The Australian Credit Approval data, which contains 690 instances, was used as the dataset. In this study, Information Gain is employed as a feature selection method to select relevant features so that the Naïve Bayes algorithm can work optimally. The confusion matrix is used as the evaluator and 10-fold cross-validation as the validator. Based on the experimental results, the proposed method improved classification performance, reaching its highest average accuracy, precision, recall, and F-measure at 86.29%, 86.33%, 86.29%, and 86.30%, respectively. In addition, the proposed method obtains an ROC area of 91.52%, which indicates excellent classification.
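How accuracy, precision, recall, and F-measure follow from a confusion matrix can be sketched as below. This is an illustrative sketch only: the matrix counts are made up and are not the paper's results, and class-size weighted averaging is an assumption about how the averages were computed.

```python
def accuracy(matrix):
    """Fraction of instances on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total


def per_class_metrics(matrix):
    """matrix[i][j] = instances of actual class i predicted as class j.

    Returns one (precision, recall, f_measure, support) tuple per class.
    """
    n = len(matrix)
    results = []
    for k in range(n):
        true_positive = matrix[k][k]
        actual = sum(matrix[k])                          # row sum
        predicted = sum(matrix[i][k] for i in range(n))  # column sum
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / actual if actual else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f_measure, actual))
    return results


def weighted_average(results):
    """Average precision, recall, and F-measure weighted by class size."""
    total = sum(r[3] for r in results)
    return tuple(sum(r[i] * r[3] for r in results) / total for i in range(3))


# Hypothetical two-class matrix (rows: actual, columns: predicted).
matrix = [
    [40, 10],
    [5, 45],
]

avg_precision, avg_recall, avg_f_measure = weighted_average(per_class_metrics(matrix))
print(f"accuracy  {accuracy(matrix):.4f}")
print(f"precision {avg_precision:.4f}")
print(f"recall    {avg_recall:.4f}")
print(f"F-measure {avg_f_measure:.4f}")
```

With class-size weighting, average recall always equals accuracy, which is consistent with the identical accuracy and recall figures quoted in the abstract.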
<p>Student satisfaction with a course is determined by many factors, one of which is the condition of the lectures. Lecture conditions can be evaluated with students as respondents. This study uses a dataset of course evaluations by 5280 students of Gazi University. The dataset consists of 28 questionnaire statements on the evaluation of lecture conditions and on student satisfaction with the course. The aim of this study is to determine which input attributes of the lecture-condition evaluation significantly affect student satisfaction in taking a course, using clustering, attribute selection, and classification. Evaluation of accuracy, ROC, TP rate, and FP rate shows that classification with the subsets produced by attribute selection performs on par with, or even better than, classification with all attributes. The attribute subset obtained with the Relief method is the best subset, yielding an accuracy of 91.08%, an ROC value of 0.942, a TP rate of 0.911, and an FP rate of 0.065. Of the 24 lecture-condition attributes used as input, 14 significantly affect student satisfaction in taking a course.</p>