DNA sequence classification is a challenging problem of central importance to genomic research. A number of algorithms have been developed for this problem and implemented as systems that are in widespread use today. In this article, we discuss two systems that predict cancer by applying signal processing to DNA sequences. The first system predicts cancer from a DNA sequence using a measure named weighted entropy, which takes both entropy and total correlation into consideration and thereby captures the distribution and correlation information of the sequence. The second system is a decision-tree-based cancer predictor in which weighted entropy serves as the split-point measure for pruning the decision tree. For the analysis, we use DNA sequences obtained from the National Center for Biotechnology Information (NCBI). Sensitivity, specificity, and accuracy are computed and compared with the existing literature. Experimental results demonstrate that the proposed scheme achieves an accuracy of 90% in classifying DNA sequences.
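As a rough illustration of the two quantities that weighted entropy is said to combine, the sketch below computes the Shannon entropy of a sequence's nucleotide distribution and the total correlation between adjacent positions. It uses only the standard textbook definitions. The function names and the choice of adjacent nucleotide pairs are assumptions made for illustration, and the abstract does not state how the two quantities are weighted, so no combined measure is attempted here.

```python
from collections import Counter
from math import log2


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def total_correlation_adjacent(seq):
    """Total correlation of adjacent nucleotide pairs (X_i, X_{i+1}).

    Standard definition: H(X_i) + H(X_{i+1}) - H(X_i, X_{i+1}).
    Using adjacent pairs is an illustrative assumption, not the paper's
    stated construction.
    """
    pairs = [(seq[i], seq[i + 1]) for i in range(len(seq) - 1)]
    first = [a for a, _ in pairs]
    second = [b for _, b in pairs]
    return shannon_entropy(first) + shannon_entropy(second) - shannon_entropy(pairs)


if __name__ == "__main__":
    example = "ACGTTGCAACGT"  # hypothetical sequence, not taken from NCBI data
    print("entropy:", shannon_entropy(example))
    print("total correlation:", total_correlation_adjacent(example))
```

A sequence with uniform nucleotide usage, such as `"ACGT"`, has an entropy of 2 bits, while a strictly alternating sequence has high total correlation because each nucleotide determines its neighbor.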