Citation for published version
Wan, Cen and Freitas, Alex A. (2015) Two methods for constructing a gene ontology-based feature network for a Bayesian network classifier and applications to datasets of aging-related genes.
ABSTRACT
In the context of the classification task of data mining or machine learning, hierarchical feature selection methods exploit hierarchical relationships among features in order to select a subset of features without hierarchical redundancy. Hierarchical feature selection is a new research area in classification research, since nearly all feature selection methods ignore hierarchical relationships among features. This paper proposes two methods for constructing a network of features to be used by a Bayesian Network Augmented Naïve Bayes (BAN) classifier, in datasets of aging-related genes where Gene Ontology (GO) terms are used as hierarchically related predictive features. One of the BAN network construction methods relies on a hierarchical feature selection method to detect and remove hierarchical redundancies among features (GO terms), whilst the other simply uses a conventional, flat feature selection method to select features, without removing the hierarchical redundancies associated with the GO. Both BAN network construction methods may create new edges among nodes (features) in the BAN network that did not exist in the original GO DAG (Directed Acyclic Graph), in order to preserve the generalization-specialization (ancestor-descendant) relationships among selected features. Experiments comparing these two BAN network construction methods, using two different hierarchical feature selection methods and one flat feature selection method, have shown that the best results are obtained by the BAN network construction method using one type of hierarchical feature selection method, i.e., select Hierarchical Information-Preserving features (HIP).
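To illustrate how feature selection can lead to edges that did not exist in the original GO DAG, the following minimal Python sketch links each selected feature to its nearest selected ancestors. The abstract does not specify the exact edge-construction rule, so the "nearest selected ancestor" rule used here is an assumption; the function names (`ancestors`, `build_feature_edges`) and the toy DAG with nodes A to E are hypothetical and do not correspond to real GO terms.

```python
from typing import Dict, Iterable, Set, Tuple


def ancestors(node: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return all ancestors of `node` in a DAG given as child -> set of parents."""
    seen: Set[str] = set()
    stack = list(parents.get(node, ()))
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents.get(p, ()))
    return seen


def build_feature_edges(
    parents: Dict[str, Set[str]], selected: Iterable[str]
) -> Set[Tuple[str, str]]:
    """Connect each selected feature to its nearest selected ancestors.

    Assumed rule (not taken from the paper): a selected ancestor `a` of
    `node` is "nearest" if no other selected ancestor of `node` is itself
    a descendant of `a`.
    """
    selected_set = set(selected)
    edges: Set[Tuple[str, str]] = set()
    for node in selected_set:
        selected_anc = ancestors(node, parents) & selected_set
        for a in selected_anc:
            shadowed = any(
                a in ancestors(b, parents) for b in selected_anc if b != a
            )
            if not shadowed:
                edges.add((a, node))  # edge directed from ancestor to descendant
    return edges


# Toy DAG (hypothetical): A is the root; B and C are children of A;
# D is a child of B; E is a child of both D and C.
toy_parents = {"B": {"A"}, "C": {"A"}, "D": {"B"}, "E": {"D", "C"}}

# Suppose feature selection kept A, D and E but dropped B and C.
edges = build_feature_edges(toy_parents, {"A", "D", "E"})
print(sorted(edges))
```

In this toy case the edge from A to D is new: the original DAG only contains the path from A to D through B, which was not selected, so a direct edge is created to keep the ancestor-descendant relationship between A and D. The edge from D to E already existed in the original DAG and is retained.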