Background: Angiography is the gold standard for the diagnosis of coronary artery disease. This study attempted to predict coronary artery disease by comparing the efficiency of gene expression programming, a newer data mining technique, with that of an artificial neural network, a conventional technique. It also presents the results of feature selection based on stepwise backward elimination and on classification and regression tree.
Methods: The subjects were assessed for nine coronary artery disease risk factors in order to develop a prediction model for the disease. The sample comprised 13,288 patients who were selected to undergo angiography for the diagnosis of coronary artery disease; of these, 4059 subjects were free from the disease and 9169 had it. Modeling was carried out with gene expression programming and artificial neural network techniques. DeLong's test was then used to choose the final model based on the area under the receiver operating characteristic (ROC) curve (AUC).
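The area under the ROC curve on which the model comparison rests can be read as the probability that a randomly chosen diseased subject receives a higher model score than a randomly chosen disease-free subject. The Python sketch below computes the AUC in exactly that pairwise way. The function name `auc_from_scores` and the scores are hypothetical and are not data from the study; DeLong's test itself, which compares two such correlated AUCs, is not implemented here.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (diseased, disease-free) pairs ranked correctly.

    Ties between a diseased and a disease-free score count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model scores, for illustration only.
diseased = [0.9, 0.8, 0.6, 0.55]
healthy = [0.7, 0.5, 0.4]

auc = auc_from_scores(diseased, healthy)
print(f"AUC = {auc:.3f}")
```

The double loop is quadratic in the sample size, so for a sample of the size described here a rank-based implementation would be preferable; the pairwise form is kept only because it mirrors the definition.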
Results: The artificial neural network model had an AUC of 0.719, accuracy of 73.39%, sensitivity of 93.44% and specificity of 28.34%. The gene expression programming model had an AUC of 0.720, accuracy of 73.94%, sensitivity of 93.29% and specificity of 31.43%. DeLong's test showed no significant difference between the two models (p value = 0.789). Feature selection was then used to choose a model with four risk factors and an accuracy of 73.26%.
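The combination of high sensitivity, low specificity and an accuracy in the low 70s is what class imbalance produces, because overall accuracy is the class-weighted average of sensitivity and specificity. The Python sketch below illustrates that identity. The function name `accuracy_from_rates` is hypothetical, and applying the reported rates to the full-sample class counts (9169 diseased, 4059 disease-free) is an assumption made only for illustration: the abstract does not state which subset the reported metrics were computed on, so exact agreement with the reported accuracies is not expected.

```python
def accuracy_from_rates(sensitivity, specificity, n_pos, n_neg):
    """Overall accuracy as the class-weighted mean of sensitivity and specificity."""
    total = n_pos + n_neg
    return (sensitivity * n_pos + specificity * n_neg) / total


# Class counts taken from the Methods; using them here is an illustrative assumption.
N_DISEASED, N_HEALTHY = 9169, 4059

ann_acc = accuracy_from_rates(0.9344, 0.2834, N_DISEASED, N_HEALTHY)
gep_acc = accuracy_from_rates(0.9329, 0.3143, N_DISEASED, N_HEALTHY)
# Baseline: a rule that labels every subject as diseased.
baseline_acc = accuracy_from_rates(1.0, 0.0, N_DISEASED, N_HEALTHY)

print(f"ANN {ann_acc:.3f}  GEP {gep_acc:.3f}  all-positive baseline {baseline_acc:.3f}")
```

Setting each model's accuracy against the all-positive baseline indicates how much of it reflects disease prevalence in the sample rather than discrimination between the two classes.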
Conclusion: Comparison of the results showed no significant difference between the two modeling techniques. The gene expression programming model was easy to present and interpret and could easily be converted to other programming languages; for these reasons, the researchers preferred this technique.
... modeling, some initial settings are necessary, as can be seen in Table 2. Modeling based on ANN was done using a Multilayer Perceptron (MLP) neural network. The Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, a quasi-Newton method, was used to train the network. This learning algorithm converges faster than the gradient descent and conjugate gradient algorithms and is one of the appropriate learning algorithms [26]. There is no equation for estimating parameters of a neural network model such as the number of neurons in the hidden layer, the layer activation function and the error function. We therefore created 100 neural network models by randomly selecting the parameter values, as in [29,30]. In line with this procedure, the stepwise backward elimination method was adopted to compare the results of ANN and GEP and to select the best possible model and technique. At each step, the least important risk factors were removed and the modeling process was repeated with the remaining risk factors. This process continued until there was no significant change in th...