Node classification on graph data is a major problem in machine learning, and various graph neural networks (GNNs) have been proposed to address it. Variants of GNNs such as H2GCN and CPF outperform graph convolutional networks (GCNs) by addressing weaknesses of the traditional GNN. However, on some graph data these variants perform worse than other GNNs in the node classification task. This is because H2GCN produces similar feature values on graph data with a high average degree, and CPF runs into problems with the suitability of label propagation. Accordingly, we propose a hierarchical model selection framework (HMSF) that selects an appropriate GNN model to predict the classes of nodes for each graph dataset. Based on our analyses, HMSF uses the average degree and the edge homophily ratio as indicators to decide which model is useful. In the experiment, we show that the model selected by HMSF achieves high performance on node classification for various types of graph data.
INDEX TERMS Graph neural networks (GNNs), machine learning, model selection, node classification.
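As a rough illustration of the two indicators named above, the following sketch computes the average degree and the edge homophily ratio under their commonly used definitions for an undirected graph. The function names and the toy graph are assumptions made for illustration. The abstract does not state how HMSF combines the indicators or which thresholds it uses, so no selection rule is shown.

```python
def average_degree(num_nodes, edges):
    """Mean number of neighbours per node in an undirected graph.

    Each undirected edge contributes to the degree of both endpoints,
    hence the factor of 2. `edges` lists each undirected edge once.
    """
    return 2 * len(edges) / num_nodes


def edge_homophily_ratio(edges, labels):
    """Fraction of edges whose two endpoints share the same class label."""
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy graph (hypothetical): 4 nodes, 5 undirected edges, 2 classes.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
labels = [0, 0, 1, 1]

avg_deg = average_degree(len(labels), edges)
homophily = edge_homophily_ratio(edges, labels)
print(avg_deg, homophily)
```

A ratio near 1 indicates that neighbouring nodes mostly share labels, while a ratio near 0 indicates a heterophilous graph.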
Various graph neural networks (GNNs) have been proposed to solve node classification tasks in machine learning for graph data. GNNs use the structural information of graph data by aggregating the feature vectors of neighboring nodes. However, they fail to directly characterize and leverage that structural information. In this paper, we propose a multi-duplicated characterization of graph structures using information gain ratio (IGR) for GNNs (MSI-GNN), which enhances the performance of node classification by using an i-hop adjacency matrix as the structural information of the graph data. In MSI-GNN, the i-hop adjacency matrix is adaptively adjusted by two methods: (i) structural features in the matrix are selected based on the information gain ratio and an occurrence filter, and (ii) the features selected in (i) are duplicated and combined flexibly for each node. In an experiment, we show that MSI-GNN outperforms GCN, H2GCN, and GCNII in terms of average accuracy on benchmark graph datasets.
INDEX TERMS Graph neural networks (GNNs), machine learning, node classification, feature selection.
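To make the feature selection criterion in step (i) more concrete, the sketch below computes the standard information gain ratio of a single discrete feature with respect to class labels. Treating one column of an i-hop adjacency matrix as a binary feature is an assumption made here for illustration, and the function names are hypothetical. The abstract does not give the exact scoring, the occurrence filter, or the duplication step used in MSI-GNN, so none of those are reproduced.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by the
    entropy of the feature itself (the split information)."""
    n = len(labels)
    # Group the labels by feature value.
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A constant feature carries no information; return 0 to avoid 0/0.
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical data: does "node j is reachable within i hops" predict the class?
labels = [0, 0, 1, 1]
informative = [1, 1, 0, 0]    # perfectly aligned with the labels
uninformative = [1, 0, 1, 0]  # independent of the labels

igr_informative = information_gain_ratio(informative, labels)
igr_uninformative = information_gain_ratio(uninformative, labels)
print(igr_informative, igr_uninformative)
```

Under this definition, columns with a higher ratio separate the classes better and would be the natural candidates to keep.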