Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
DOI: 10.1145/3620665.3640410

T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives

Suchita Pati, Shaizeen Aga, Mahzabeen Islam, et al.

Abstract: Large Language Models increasingly rely on distributed techniques for their training and inference. These techniques require communication across devices which can reduce scaling efficiency as the number of devices increases. While some distributed techniques can overlap, and thus, hide this communication with independent computations, techniques such as Tensor Parallelism (TP) inherently serialize communication with model execution. One approach to hide this serialized communication is to interleave it with t…

Cited by 2 publications; references 49 publications.