SC16: International Conference for High Performance Computing, Networking, Storage and Analysis 2016
DOI: 10.1109/sc.2016.63

Watch Out for the Bully! Job Interference Study on Dragonfly Network

Cited by 57 publications (29 citation statements)
References 22 publications
“…Most solutions optimize job allocation to minimize the contention on the links, either on dragonfly networks [39] or on other topologies [11,38]. For example, Yang et al. [45,47] show that whereas for communication-intensive jobs a random allocation is more beneficial, for less communication-intensive jobs a contiguous allocation is better. Starting from this observation, they propose a hybrid allocation scheme, to allocate communication-intensive jobs randomly whereas the less communication-intensive jobs are allocated on contiguous nodes.…”
Section: Related Work (mentioning)
confidence: 99%
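The hybrid allocation idea described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the cited authors' implementation: the function name `allocate`, the boolean `comm_intensive` flag, and the assumption that consecutive node ids are physically adjacent are all hypothetical.

```python
import random


def allocate(free_nodes, size, comm_intensive, rng=None):
    """Pick `size` node ids from `free_nodes` for one job.

    Assumption: consecutive node ids are physically close, so taking the
    lowest-numbered free ids stands in for a contiguous placement.
    """
    rng = rng or random.Random()
    if size > len(free_nodes):
        raise ValueError("not enough free nodes")
    if comm_intensive:
        # Communication-intensive job: spread it over randomly chosen nodes.
        return sorted(rng.sample(list(free_nodes), size))
    # Less communication-intensive job: keep it on neighbouring nodes.
    return sorted(free_nodes)[:size]
```

How a job is classified as communication-intensive is left open here; the sketch only shows the two placement branches.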
“…Earlier studies showed that applications that run on Cray XC40 might experience a high degree of variability in execution time and performance [10,14,28]. (n, p).…”
Section: Measurements Variability (mentioning)
confidence: 99%
“…However, while Aries manages long bouts of congestion better than Gemini does, application runtime variability due to network performance remains a concern [15]. • Detection of long-duration congestion using traffic measurements can facilitate intervention such as rank remapping or rescheduling of bully jobs [6]. The 99.9th percentile congested link duration observed in both systems for PTS_th ≤ 20% is greater than a minute.…”
Section: Impact Of Routing Algorithms On Congestion (See Subsection … (mentioning)
confidence: 99%