Introduction: In recent decades, the rising incidence of cancer has become a major concern for most societies. Because cancer has genetic origins, studying the disease requires analyzing its underlying genetic structure. Methods: In this research, cancer data are analyzed based on DNA sequences. The transition probabilities between nucleotide pairs in DNA sequences have the Markovian property. This property motivates reducing the feature dimension of DNA sequences to overcome the high computational overhead of gene analysis. This research applies that idea by mapping DNA sequences to a reduced feature space based on their Markovian property. The mapping decreases the feature dimension while conserving the basic properties needed to discriminate cancerous from non-cancerous genes. Results: A non-linear support vector machine (SVM) classifier with RBF and polynomial kernel functions can discriminate the selected cancerous samples from non-cancerous ones. Experimental results based on 10-fold cross-validation and the accuracy metric verified that the proposed method has low computational overhead and high accuracy. Conclusion: The proposed algorithm was successfully tested on case studies from related research. In general, the combination of the proposed Markovian-based feature reduction and a non-linear SVM classifier can be considered one of the best methods for discriminating cancerous from non-cancerous genes.
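The Markovian feature reduction described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the exact mapping, so the code below assumes a first-order Markov chain over the four nucleotides and maps a sequence of any length to its 16 transition probabilities; the function name `transition_features` and this choice of chain order are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"


def transition_features(seq):
    """Map a DNA sequence to 16 first-order transition probabilities.

    The result is ordered row-major over ACGT: P(A|A), P(C|A), ..., P(T|T),
    where P(b|a) is the fraction of occurrences of `a` that are followed by `b`.
    Assumption: a first-order chain; the source does not state the order.
    """
    seq = seq.upper()
    # Count adjacent nucleotide pairs, skipping ambiguous symbols such as N.
    counts = Counter(
        (a, b)
        for a, b in zip(seq, seq[1:])
        if a in NUCLEOTIDES and b in NUCLEOTIDES
    )
    features = []
    for a in NUCLEOTIDES:
        row_total = sum(counts[(a, b)] for b in NUCLEOTIDES)
        for b in NUCLEOTIDES:
            # Rows with no observations are left as zeros.
            features.append(counts[(a, b)] / row_total if row_total else 0.0)
    return features


if __name__ == "__main__":
    # A long sequence still yields a fixed-length, 16-dimensional vector.
    print(transition_features("ACGTACGTTTGACA"))
```

A fixed-length vector of this kind could then be passed to a non-linear classifier such as an SVM with an RBF or polynomial kernel, as the abstract describes.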
In this paper we introduce a new analytical approach for the management of waterfloods in heterogeneous reservoirs. The main contribution is the development of a process and metric to evaluate pair-wise injector-producer (IP) relationships, i.e., to quantify the impact of any injection well on the neighboring producing wells. The proposed metric is specifically designed to account for the non-linearity of the IP relationship between injection and production rates by using the Mutual Information (MI) data mining tool. Non-linearity is the main challenge in quantifying the IP relationship and, to the best of our knowledge, this is the first time that MI has been used in the petroleum literature for IP relationship identification. In addition to MI, which captures the non-linear correlation in the IP relationship, our metric considers other parameters such as the distance between the IP pair and their relative injection and production rates. Leveraging this metric, we propose a system for optimal waterflooding with which a field engineer can automatically: 1) identify the under-performing producers based on performance characteristics such as water-oil ratio, gas-oil ratio, and oil production rate; 2) rank all injectors by their impact on the under-performing producers using the proposed IP relationship identification metric; 3) decide on optimal injection volumes for the individual injectors that have the most impact on the under-performing producers, so as to maximize the recovery factor. The proposed technique can significantly reduce the decision-making time for the effective management of complex waterfloods.
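To make the role of Mutual Information concrete, here is a minimal sketch that estimates MI between an injection-rate series and a production-rate series. It assumes a simple equal-width histogram estimator; the abstract does not say which estimator the authors use, and the function name `mutual_information` and the `bins` parameter are hypothetical. The sketch covers only the MI term, not the distance or relative-rate components of the full metric.

```python
import math
from collections import Counter


def mutual_information(x, y, bins=4):
    """Estimate MI (in nats) between two equal-length series by binning.

    Assumption: equal-width bins over each series' observed range.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")

    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # avoid zero width for constant series
        # Clamp the maximum value into the last bin.
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(x), discretize(y)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)

    # MI = sum over observed cells of p(i,j) * log(p(i,j) / (p(i) * p(j))).
    return sum(
        (c / n) * math.log((c * n) / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


if __name__ == "__main__":
    # Synthetic, illustrative rates: a symmetric non-linear dependence
    # that a linear correlation coefficient would miss.
    injection = [-3, -2, -1, 1, 2, 3]
    production = [v * v for v in injection]
    print(mutual_information(injection, production, bins=3))
```

Unlike a linear correlation coefficient, the estimate is positive for the symmetric quadratic dependence in the usage example, which is the kind of non-linear IP relationship the abstract highlights. Histogram estimates are sensitive to the bin count and sample size, so the value of `bins` here is only a placeholder.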