Background: Machine learning techniques are becoming useful as an alternative to conventional medical diagnosis or prognosis because they handle noisy and incomplete data well, and significant results can be attained despite a small sample size. Traditionally, clinicians make prognostic decisions based on clinicopathologic markers. However, it is not easy for even the most skilful clinician to arrive at an accurate prognosis by using these markers alone. Thus, there is a need to use genomic markers to improve the accuracy of prognosis. The main aim of this research is to apply a hybrid of feature selection and machine learning methods in oral cancer prognosis based on the parameters of the correlation of clinicopathologic and genomic markers. Results: In the first stage of this research, five feature selection methods were proposed and tested on the oral cancer prognosis dataset. In the second stage, the models built from the features selected by each feature selection method were tested on the proposed classifiers. Four types of classifiers were chosen: adaptive neuro-fuzzy inference system (ANFIS), artificial neural network, support vector machine and logistic regression. k-fold cross-validation was applied to all types of classifiers because of the small sample size. The hybrid model of ReliefF-GA-ANFIS with the 3 input features of drink, invasion and p63 achieved the best accuracy (accuracy = 93.81%; AUC = 0.90) for oral cancer prognosis. Conclusions: The results revealed that the prognosis is superior when both clinicopathologic and genomic markers are present. The selected features can be investigated further to validate their potential as a significant prognostic signature in oral cancer studies.
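The k-fold cross-validation mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, and the majority-class placeholder classifier are all assumptions made for illustration, and any of the four classifiers named in the abstract would take the place of the placeholder in practice.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k assigns every index to exactly one fold.
    return [indices[i::k] for i in range(k)]


def cross_validate(features, labels, fit, predict, k=5, seed=0):
    """Hold each fold out once for testing and return the per-fold accuracy."""
    accuracies = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


# Hypothetical placeholder classifier: always predicts the majority training class.
def fit_majority(train_x, train_y):
    return max(sorted(set(train_y)), key=train_y.count)


def predict_majority(model, x):
    return model


# Toy data (invented for illustration): one feature per sample, binary labels.
toy_features = [[float(i)] for i in range(10)]
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

fold_accuracies = cross_validate(toy_features, toy_labels, fit_majority, predict_majority, k=5)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
```

Averaging the per-fold accuracies gives a single estimate that uses every sample for testing exactly once, which is why the technique suits small datasets.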
An automated plant species identification system could help botanists and laypeople identify plant species rapidly. Deep learning is robust for feature extraction because it captures deeper information from images. In this research, a new CNN-based method named D-Leaf was proposed. The leaf images were pre-processed and the features were extracted by using three different Convolutional Neural Network (CNN) models, namely pre-trained AlexNet, fine-tuned AlexNet and D-Leaf. These features were then classified by using five machine learning techniques, namely Support Vector Machine (SVM), Artificial Neural Network (ANN), k-Nearest-Neighbour (k-NN), Naïve-Bayes (NB) and CNN. A conventional morphometric method, which computed morphological measurements from Sobel-segmented veins, was employed for benchmarking purposes. The D-Leaf model achieved a testing accuracy of 94.88%, comparable to the AlexNet (93.26%) and fine-tuned AlexNet (95.54%) models. In addition, the CNN models performed better than the traditional morphometric measurements (66.55%). The features extracted from the CNNs were found to fit well with the ANN classifier. The experimental results show that D-Leaf can be an effective automated system for plant species identification.
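The second step described above, classifying extracted feature vectors, can be sketched with k-NN, one of the five classifiers the abstract lists. This is a minimal illustration under stated assumptions, not the D-Leaf pipeline: the two-dimensional feature vectors and the species labels are invented, whereas real CNN features would have far more dimensions.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    # Pair each training vector's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(vector, query), label)
        for vector, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors standing in for CNN-extracted leaf features.
train_features = [
    [0.0, 0.0], [0.2, 0.1], [0.1, 0.3],   # cluster for the first species
    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1],   # cluster for the second species
]
train_labels = ["species_a"] * 3 + ["species_b"] * 3

prediction_near_a = knn_predict(train_features, train_labels, [0.1, 0.2])
prediction_near_b = knn_predict(train_features, train_labels, [5.0, 5.1])
```

Because the classifier only sees feature vectors, the same function works whether the features come from a pre-trained network, a fine-tuned one, or hand-crafted morphometric measurements.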
Although most cervical cancer cases are reported to be closely related to Human Papillomavirus (HPV) infection, there is a need to study genes that stand out differentially in the final actualization of cervical cancers following HPV infection. In this study, we proposed an integrative machine learning approach to analyse multiple gene expression profiles in cervical cancer in order to identify a set of genetic markers that are associated with, and may eventually aid in, the diagnosis or prognosis of cervical cancers. The proposed integrative analysis is composed of three steps: (i) gene expression analysis of individual datasets; (ii) meta-analysis of multiple datasets; and (iii) feature selection and machine learning analysis. As a result, 21 gene expressions were identified through the integrative machine learning analysis, which included seven supervised methods and one unsupervised method. A functional analysis with GSEA (Gene Set Enrichment Analysis) was performed on the selected 21-gene expression set and showed significant enrichment in a potential nine-gene expression signature, namely PEG3, SPON1, BTD and RPLP2 (upregulated genes) and PRDX3, COPB2, LSM3, SLC5A3 and AS1B (downregulated genes).
BACKGROUND Chili is one of the most important and high‐value vegetable crops worldwide. However, pest and disease infections are among the main limiting factors in chili cultivation. These diseases cannot be eradicated but can be handled and monitored to mitigate the damage. Hence, the use of an automated identification system based on images will promote quick identification of chili disease. The features extracted from the images are of utmost importance to develop such an accurate identification system. RESULTS In this research, chili pest and disease features extracted using the traditional approach were compared with features extracted using a deep‐learning‐based approach. A total of 974 chili leaf images were collected, which consisted of five types of diseases, two types of pest infestations, and a healthy type. Six traditional feature‐based approaches and six deep‐learning feature‐based approaches were used to extract significant pests and disease features from the chili leaf images. The extracted features were fed into three machine learning classifiers, namely a support vector machine (SVM), a random forest (RF), and an artificial neural network (ANN) for the identification task. The results showed that deep learning feature‐based approaches performed better than the traditional feature‐based approaches. The best accuracy of 92.10% was obtained with the SVM classifier. CONCLUSION A deep‐learning feature‐based approach could capture the details and characteristics between different types of chili pests and diseases even though they possessed similar visual patterns and symptoms. © 2020 Society of Chemical Industry