According to the Center for Data and Information of the Indonesian Ministry of Health, cancer caused about 8.2 million deaths in 2012. Recent developments show that DNA microarray technology can support early cancer detection, but its main disadvantage is the curse of dimensionality. Analysis of Variance (ANOVA) is a feature selection method that can mitigate this weakness. ANOVA can find informative gene pairs that assist the classification performed by a Support Vector Machine (SVM). In SVM, the kernel trick used when learning the model helps to overcome the feature space problem, and the choice of kernel affects the resulting accuracy. Through a series of steps comprising correlation calculation, feature selection, and classification with SVM, accuracy is obtained for each of the four datasets used. For the leukemia and ovarian cancer datasets, the highest accuracy is produced by the polynomial kernel, at 100% and 97.54% respectively, with a parameter value of 5. For the lung cancer dataset, the highest accuracy is obtained with the linear kernel, at 100% with the parameter value 0 . 1 C, and for the colon tumor dataset the highest accuracy is obtained with the RBF kernel, at 85.15% with the parameter value 5 . 1 C 5 . 0. The kernel that produces the highest accuracy differs between datasets and depends strongly on the characteristics of each cancer dataset.

Keywords: cancer detection, DNA microarray, dimension reduction, correlation, analysis of variance, support vector machine, kernel trick
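As a rough illustration of the two ideas the abstract combines, the sketch below ranks features by a one-way ANOVA F-statistic and defines the three kernel families it compares (linear, polynomial, RBF). It is not the authors' implementation: the function names, the toy data, and the default values of `degree`, `coef0`, and `gamma` are all assumptions made for the example, and the correlation step and gene-pair search mentioned in the abstract are left out.

```python
import math


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (means[y] - grand_mean) ** 2 for y, g in groups.items())
    ss_within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    if ms_within == 0:
        # Degenerate case: no within-group spread.
        return math.inf if ms_between > 0 else 0.0
    return ms_between / ms_within


def top_k_features(X, y, k):
    """Indices of the k features (columns of X) with the largest F-statistic."""
    n_features = len(X[0])
    scores = [anova_f([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def linear_kernel(a, b):
    return sum(p * q for p, q in zip(a, b))


def polynomial_kernel(a, b, degree=3, coef0=1.0):
    # degree and coef0 are illustrative defaults, not values from the paper.
    return (linear_kernel(a, b) + coef0) ** degree


def rbf_kernel(a, b, gamma=0.5):
    # gamma is an illustrative default, not a value from the paper.
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


# Toy expression matrix: 4 samples x 3 hypothetical genes, two classes.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 5.1, 0.9],
    [3.0, 5.0, 0.4],
    [3.1, 4.9, 0.8],
]
y = [0, 0, 1, 1]

selected = top_k_features(X, y, 2)
print(selected)  # prints [0, 1]
```

In practice the selected columns would be passed to an SVM trained with each kernel in turn, and the kernel and its parameters chosen by cross-validated accuracy on each dataset.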