2021
DOI: 10.1109/tpds.2021.3065737
Efficient Data Loader for Fast Sampling-Based GNN Training on Large Graphs

Cited by 26 publications (13 citation statements)
References 23 publications
“…In this section, we review recent methods for improving the efficiency of GNNs, which is regarded as aligning GNN research with social values regarding environmental well-being. Generally, the efficiency improvement is evaluated with reference to time-related metrics (e.g., response latency or speedup ratio [257], [258], throughput [259], [260], communication time [261]), energy-related metrics (e.g., nodes-per-Joule [36], energy consumption [262]), or resource-related metrics (e.g., memory footprint [72], cache access performance, and peak memory usage [263]). Existing methods include scalable GNN architectures and efficient data communication, model compression methods, and efficient frameworks and accelerators.…”
Section: Environmental Well-being of GNNs
Mentioning confidence: 99%
“…Moreover, some GNNs suffer from inefficient data loading. For example, data loading occupies 74% of the whole training time for GCN [260]. A method called PaGraph [260] analyses the pipeline bottlenecks of GNNs and proposes a GPU cache policy to reduce the time spent moving data from CPU to GPU.…”
Section: Environmental Well-being of GNNs
Mentioning confidence: 99%
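To make the cache-policy idea concrete, below is a minimal sketch in the spirit of the approach the quote describes: features of the highest-degree nodes are pre-loaded onto the GPU (high-degree nodes are the ones sampling touches most often), and misses fall back to a CPU-to-GPU copy. The class name `GPUFeatureCache` and every detail here are illustrative assumptions, not PaGraph's actual implementation.

```python
import torch

class GPUFeatureCache:
    """Static GPU cache for node features (illustrative sketch, not PaGraph's API).

    Pre-loads the feature rows of the highest-degree nodes onto the GPU;
    lookups for uncached nodes fall back to a host-to-device copy.
    """

    def __init__(self, cpu_feats, degrees, capacity, device="cuda"):
        self.cpu_feats = cpu_feats                    # [N, F] features kept in host memory
        self.device = device
        hot = torch.topk(degrees, capacity).indices   # nodes most likely to be sampled
        self.gpu_feats = cpu_feats[hot].to(device)    # cached feature rows on the GPU
        # map: global node id -> row in the GPU cache (-1 means not cached)
        self.slot = torch.full((cpu_feats.size(0),), -1, dtype=torch.long)
        self.slot[hot] = torch.arange(capacity)

    def gather(self, node_ids):
        """Return device-resident features for a mini-batch of node ids (CPU LongTensor)."""
        slots = self.slot[node_ids]       # cache row per requested node, -1 on miss
        hit = slots >= 0
        feats = torch.empty(len(node_ids), self.cpu_feats.size(1), device=self.device)
        feats[hit.to(self.device)] = self.gpu_feats[slots[hit].to(self.device)]
        feats[(~hit).to(self.device)] = self.cpu_feats[node_ids[~hit]].to(self.device)
        return feats
```

A degree-ordered static cache like this needs no eviction logic during training, which is why it pairs well with sampling-based loaders: the access distribution is skewed and stable across epochs.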
“…AliGraph [14] used a novel storage layer to cache nodes and their intermediate results, reducing communication between the local processor and other processors. Similarly, Bai et al. [26] presented an efficient data loader that keeps frequently accessed nodes in a cache via a novel indexing algorithm, speeding up information exchange between processors and reducing communication time. Jiang et al. [16] assigned different sampling probabilities to nodes on the current processor and on other processors.…”
Section: Optimized Distributed Graph Representation Learning
Mentioning confidence: 99%
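The locality-biased sampling attributed to Jiang et al. [16] can be sketched as follows: neighbors stored on the current processor get a higher sampling weight than remote ones. The 4:1 weighting and the function name are assumptions for illustration, not the probabilities or interface used in the cited paper.

```python
import numpy as np

def locality_biased_sample(neighbors, is_local, k, local_weight=4.0, rng=None):
    """Sample k neighbors, favoring nodes stored on the current processor.

    neighbors:    array of candidate neighbor ids
    is_local:     boolean mask, True where the neighbor lives on this processor
    local_weight: how much more likely a local neighbor is to be drawn
                  (illustrative value; the cited paper's weighting may differ)
    """
    rng = rng or np.random.default_rng()
    weights = np.where(is_local, local_weight, 1.0)
    probs = weights / weights.sum()
    k = min(k, len(neighbors))
    return rng.choice(neighbors, size=k, replace=False, p=probs)
```

Biasing the sampler toward local nodes trades a little estimation variance for far fewer cross-processor fetches, which is the communication cost all three cited systems attack.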
“…For example, GraphSAGE [7] computes the Max of the neighboring nodes, while some other models use Sum [29]. The aggregation result is given to a linear function (Linear) and an activation function (ReLU) to obtain the intermediate embedding y^(1)_i. The intermediate embeddings are further aggregated for a few layers to obtain the output embeddings.…”
Section: Introduction
Mentioning confidence: 99%
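The layer described in the quote, y^(1)_i = ReLU(Linear(Max_{j in N(i)} x^(0)_j)), can be written as a short PyTorch module. This is a minimal sketch of the aggregation step only, assuming every node has at least one neighbor; full GraphSAGE also concatenates the node's own feature, which is omitted here.

```python
import torch
import torch.nn as nn

class MaxAggLayer(nn.Module):
    """One Max-aggregation layer: y_i = ReLU(Linear(max over neighbors of x_j))."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, neighbor_lists):
        # x: [N, in_dim] input embeddings x^(0)
        # neighbor_lists[i]: indices of N(i), assumed non-empty
        agg = torch.stack([x[nbrs].max(dim=0).values for nbrs in neighbor_lists])
        return torch.relu(self.linear(agg))   # intermediate embeddings y^(1)
```

Stacking layers of this form is what produces the multi-hop output embeddings the quote mentions: layer l consumes the y^(l-1) embeddings of each node's neighbors.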
“…The idea is to sample a subset of neighbors and estimate the aggregation results based on the sampled nodes. As shown in Figure 2c, instead of computing the accurate value of x^(1)_1 with all of x^(0)…”
Section: Introduction
Mentioning confidence: 99%
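A minimal sketch of the sampling-based estimator the quote describes, for a Sum aggregator: draw k neighbors uniformly without replacement and rescale the sampled Sum by |N(i)|/k, which makes the estimate unbiased for the full-neighborhood Sum. The function name and the uniform-sampling choice are illustrative assumptions.

```python
import numpy as np

def estimated_sum_aggregation(x, neighbors, k, rng=None):
    """Estimate the Sum aggregation over all neighbors from a sample of k.

    x:         [N, F] node feature matrix x^(0)
    neighbors: indices of N(i)
    Rescaling by len(neighbors)/k makes the sampled Sum an unbiased
    estimator of the exact Sum over the whole neighborhood.
    """
    rng = rng or np.random.default_rng()
    if len(neighbors) <= k:                  # small neighborhood: aggregate exactly
        return x[neighbors].sum(axis=0)
    sampled = rng.choice(neighbors, size=k, replace=False)
    return x[sampled].sum(axis=0) * (len(neighbors) / k)
```

Capping the neighborhood at k neighbors per node is what bounds the mini-batch size in sampling-based GNN training, and it is exactly this sampled access pattern that the cached data loaders above are built to serve quickly.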