Some recent studies have shown that association rules can reveal interactions between genes that traditional analysis methods such as clustering might not reveal. However, existing studies consider only association rules among individual genes. In this paper, we propose a new data mining method named MAGO for discovering multilevel gene association rules from gene microarray data and the concept hierarchy of Gene Ontology (GO). The proposed method efficiently finds relations between GO terms by analyzing gene expression together with the GO hierarchy. For example, with the biological process in GO, rules such as Process A (up) → Process B (up) can be discovered, indicating that the genes involved in Process B are likely to be up-regulated whenever those involved in Process A are up-regulated. Moreover, we propose a constrained mining method named CMAGO for discovering multilevel gene expression rules under user-specified constraints. Empirical evaluation shows that the proposed methods have excellent performance in discovering hidden multilevel gene association rules.
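To make the rule Process A (up) → Process B (up) concrete, the following is a minimal sketch of how support and confidence for such a term-level rule could be computed. It is not the MAGO algorithm, whose details the abstract does not give. The names `term_is_up` and `rule_stats`, the `min_fraction` threshold, and the convention that a GO term counts as "up" in a sample when at least that fraction of its annotated genes is up-regulated are all assumptions made for illustration.

```python
def term_is_up(sample_up_genes, term_genes, min_fraction=0.5):
    """Assumed convention: a term is 'up' in a sample if at least
    min_fraction of its annotated genes are up-regulated there."""
    hits = len(sample_up_genes & term_genes)
    return hits / len(term_genes) >= min_fraction


def rule_stats(samples, genes_a, genes_b, min_fraction=0.5):
    """Support and confidence of the rule: term A (up) -> term B (up).

    samples: list of sets, each holding the up-regulated genes of one sample.
    genes_a, genes_b: non-empty sets of genes annotated to terms A and B.
    """
    n_a = 0   # samples in which A is up
    n_ab = 0  # samples in which both A and B are up
    for up_genes in samples:
        a_up = term_is_up(up_genes, genes_a, min_fraction)
        b_up = term_is_up(up_genes, genes_b, min_fraction)
        n_a += a_up
        n_ab += a_up and b_up
    support = n_ab / len(samples)
    confidence = n_ab / n_a if n_a else 0.0
    return support, confidence


# Hypothetical toy data: four samples, two terms with two genes each.
samples = [{"g1", "g2", "g3"}, {"g1", "g3", "g4"}, {"g2"}, {"g3", "g4"}]
support, confidence = rule_stats(samples, {"g1", "g2"}, {"g3", "g4"})
```

A multilevel miner would repeat this kind of count for terms at different depths of the GO hierarchy, where a parent term inherits the genes annotated to its descendants.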
In previous work, we proposed a time series segmentation approach that combines clustering, the Discrete Wavelet Transform (DWT), and a genetic algorithm to automatically find segments and patterns in a time series. In this paper, we propose a PIP-based evolutionary approach, which uses Perceptually Important Points (PIP) instead of DWT to adjust the length of subsequences effectively, so that appropriate segments and patterns are found and some problems of the previous approach are avoided. To this end, we design an enhanced suitability factor in the fitness function, modified from that of the previous approach. Experimental results on a real financial dataset show the effectiveness of the proposed approach.
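As background for the PIP step, here is a minimal sketch of the commonly used PIP identification procedure: start from the two endpoints, then repeatedly add the point that lies farthest from the line joining its neighbouring PIPs. The function name `find_pips` and the choice of vertical distance are assumptions for illustration; the abstract does not say which distance measure the authors use, and the evolutionary part of their approach is not shown.

```python
import bisect


def find_pips(series, n_pips):
    """Return sorted indices of up to n_pips perceptually important points.

    Uses vertical distance to the line between adjacent PIPs
    (an assumed choice; other distance measures are also used).
    """
    n = len(series)
    if n < 2 or n_pips < 2:
        raise ValueError("need at least two points and two PIPs")
    pips = [0, n - 1]  # the endpoints are always the first two PIPs
    while len(pips) < min(n_pips, n):
        best_idx, best_dist = None, -1.0
        for left, right in zip(pips, pips[1:]):
            slope = (series[right] - series[left]) / (right - left)
            for i in range(left + 1, right):
                line_y = series[left] + slope * (i - left)
                dist = abs(series[i] - line_y)
                if dist > best_dist:
                    best_idx, best_dist = i, dist
        if best_idx is None:  # no interior points left to add
            break
        bisect.insort(pips, best_idx)
    return pips


# Hypothetical toy series; the selected indices mark candidate segment boundaries.
prices = [1, 2, 8, 3, 2, 1, 5]
pips = find_pips(prices, 4)
```

Because the PIPs compress a subsequence to a fixed number of points regardless of its length, subsequences of different lengths can be compared against the same pattern templates.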
Selecting informative genes is one of the most important issues in deciphering the biological information hidden in gene expression data. However, because microarray data contain few samples and a large number of genes, general feature selection methods that do not take biological relevance into account become questionable. In this paper, we propose a novel classification method for microarray data that integrates a multi-information based gene scoring method with biological information. Experimental evaluation shows that the proposed method delivers good classification accuracy and provides biologists with deeper insights into the relations between genes and gene function categories.