2022 30th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)
DOI: 10.1109/pdp55904.2022.00026
GraphCL: A Framework for Execution of Data-Flow Graphs on Multi-Device Platforms

Cited by 2 publications (2 citation statements)
References 17 publications
“…Then, the last step of the profiling phase, Step C, is performed. At this point, the acceleration between the two offloading modes is computed to determine the best strategy for the device being profiled. Values higher than 1.0 indicate that the device benefits from workload-splitting strategies, increasing throughput by exploiting multiple command queues, overlap between computation and communication, and appropriate interleaving of management and computation, as demonstrated in previous studies [17,19,21-24]. Conversely, values lower than 1.0 indicate that the device is penalized by device management and chunk synchronization, by sharing CPU time with the simulator itself or other tasks, or by very short execution times, for which generating multiple chunks is usually counterproductive.…”
Section: Mash Algorithm
confidence: 98%
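The decision rule in the statement above reduces to a threshold test on the acceleration ratio. The following minimal Python sketch illustrates that Step C logic; the function name, the throughput inputs, and the strategy labels are illustrative assumptions, not the profiler's actual API:

def choose_strategy(throughput_single: float, throughput_split: float) -> str:
    """Pick the offloading strategy for a profiled device (hypothetical sketch)."""
    # Acceleration of the workload-splitting mode over the single-chunk mode.
    acceleration = throughput_split / throughput_single
    # > 1.0: the device benefits from splitting (multiple command queues,
    # computation/communication overlap, management/compute interleaving).
    # <= 1.0: device management, chunk synchronization, or very short
    # execution times make multiple chunks counterproductive.
    return "split" if acceleration > 1.0 else "single-chunk"

# Example: a device whose measured split-mode throughput is higher.
print(choose_strategy(1.8e6, 2.3e6))  # -> "split"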
“…Due to the complexity of graph structures, traditional contrastive learning methods cannot be applied directly to graph data. Graph contrastive learning methods [54-56] bring similar graphs closer together by designing effective graph-matching or clustering losses.…”
Section: Contrastive Learning
confidence: 99%
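To make the "pull similar graphs closer" idea concrete, here is a minimal NT-Xent-style contrastive loss sketch in Python/PyTorch, assuming each graph has already been encoded into an embedding vector (e.g., by a GNN readout). This is an illustrative simplification under those assumptions, not the exact loss of any method cited above:

import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """z1, z2: (N, d) embeddings of two augmented views of the same N graphs."""
    z1 = F.normalize(z1, dim=1)          # unit-length rows, so dot products are cosine similarities
    z2 = F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / tau              # (N, N) cross-view similarity matrix
    targets = torch.arange(z1.size(0))   # view 1 of graph k should match view 2 of graph k
    # Cross-entropy with positives on the diagonal: pulls matching views
    # together and pushes the other graphs in the batch apart.
    return F.cross_entropy(sim, targets)

# Example with random embeddings for a batch of 8 graphs, dimension 64:
loss = nt_xent_loss(torch.randn(8, 64), torch.randn(8, 64))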