2021
DOI: 10.48550/arxiv.2108.10521
Preprint
Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study

Abstract: Training deep graph neural networks (GNNs) is notoriously hard. Besides the standard plights in training deep architectures such as vanishing gradients and overfitting, the training of deep GNNs also uniquely suffers from over-smoothing, information squashing, and so on, which limits their potential power on large-scale graphs. Although numerous efforts are proposed to address these limitations, such as various forms of skip connections, graph normalization, and random dropping, it is difficult to disentangle …

Cited by 2 publications (3 citation statements) | References 36 publications
“…We have used the basic vanilla-GCN implementation in PyTorch provided by the authors of [1] to incorporate our proposed techniques and show their effectiveness in making the traditional GCN comparable to or better than SOTA. For our evaluation on Cora, Citeseer, Pubmed, and OGBN-ArXiv, we have closely followed the data split settings and metrics reported by the recent benchmark [49]. See details in Appendix C. For comparison with SOTA models, we have used JKNet [50], InceptionGCN [51], SGC [52], GAT [3], GCNII [24], and DAGNN [53].…”
Section: Dataset and Experimental Setup (mentioning, confidence: 99%)
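The vanilla GCN layer that this citing work builds on propagates features with symmetrically normalized adjacency: H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W). The following is a minimal NumPy sketch of that propagation rule, not the authors' PyTorch implementation; the toy graph and weights are illustrative assumptions.

```python
import numpy as np

def gcn_layer(adj, features, weight):
    """One vanilla GCN step: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    deg = a_hat.sum(axis=1)                      # node degrees of A+I
    d_inv_sqrt = np.diag(deg ** -0.5)            # D^-1/2
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt     # symmetric normalization
    return np.maximum(a_norm @ features @ weight, 0.0)  # ReLU

# toy 3-node path graph with 2-d features and 2 output channels
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
x = np.eye(3, 2)
w = np.ones((2, 2))
h = gcn_layer(adj, x, w)
print(h.shape)  # (3, 2)
```

Deep GNNs stack many such layers, which is where over-smoothing enters: repeated multiplication by the normalized adjacency pulls node representations toward each other.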
“…We use the Adam optimizer for our experiments, perform a grid search to tune hyperparameters for our proposed methods, and report our settings in Table 1. For all our experiments, we train our modified GCNs for 1500 epochs and 100 independent repetitions following [49], and report average node classification accuracy with standard deviations. All experiments on large graph datasets, e.g., OGBN-ArXiv, are conducted on a single 48G Quadro RTX 8000 GPU, while small graph experiments are completed on a single 16G RTX 5000 GPU.…”
Section: Dataset and Experimental Setup (mentioning, confidence: 99%)
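The evaluation protocol above (100 independent repetitions, report mean and standard deviation of accuracy) can be sketched as follows; `fake_run` is a hypothetical stand-in for a full 1500-epoch training run, used here only to show the aggregation.

```python
import random
import statistics

def aggregate_runs(run_fn, repetitions=100, seed=0):
    """Repeat an experiment and report mean and std of its accuracy."""
    rng = random.Random(seed)
    accs = [run_fn(rng) for _ in range(repetitions)]
    return statistics.mean(accs), statistics.stdev(accs)

def fake_run(rng):
    # placeholder for one full training run; returns a noisy accuracy
    return 0.80 + rng.gauss(0.0, 0.01)

mean_acc, std_acc = aggregate_runs(fake_run)
print(f"{mean_acc:.3f} ± {std_acc:.3f}")
```

Reporting mean ± std over many seeds is what makes the small accuracy differences between deep-GNN tricks in the benchmark statistically interpretable.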
“…Considering the node classification task in graph analytics, vanilla training based on cross-entropy minimization often leads to over-confident predictions on the training data and poor generalization to the testing data [51]. It is also reported that the vanilla training of GNNs is sensitive to overfitting [4,26]. These all point to optimization problems.…”
Section: Why GNNs Generalize Poorly (mentioning, confidence: 99%)
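The over-confidence mechanism mentioned in this citation statement can be seen directly from the cross-entropy objective: the loss keeps decreasing as the correct-class logit grows, so an optimizer is rewarded for pushing predicted probabilities arbitrarily close to 1 on training points. A small self-contained sketch (stdlib only, toy logits as an illustrative assumption):

```python
import math

def softmax_confidence(logits):
    """Probability assigned to the most likely class."""
    exps = [math.exp(z - max(logits)) for z in logits]
    return max(exps) / sum(exps)

def cross_entropy(logits, label):
    """Negative log-probability of the true class under softmax."""
    exps = [math.exp(z - max(logits)) for z in logits]
    probs = [e / sum(exps) for e in exps]
    return -math.log(probs[label])

# scaling the correct-class logit keeps lowering the loss while
# driving predicted confidence toward 1 (over-confidence)
for scale in (1.0, 5.0, 20.0):
    logits = [scale, 0.0, 0.0]
    print(scale,
          round(cross_entropy(logits, 0), 4),
          round(softmax_confidence(logits), 4))
```

Because the loss has no finite minimizer in logit space, regularization such as early stopping or label smoothing is typically needed to keep training-set confidence calibrated.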