Interconnection network performance is a key issue in HPC systems and datacenters, especially as the number of end nodes grows to cope with application needs. The network topology and the routing algorithm are important factors for both performance and cost. Topologies such as fat-tree and Dragonfly were proposed to maximize network performance while reducing network resources. One of the most promising topologies is Slim Fly, which offers high network bandwidth while ensuring a low network diameter. However, adversarial traffic and/or congestion situations may degrade Slim Fly's performance dramatically. Non-minimal routing algorithms, such as Valiant or UGAL, can mitigate the former problem, while queuing schemes can handle the latter. In this paper, we propose a combined mechanism that provides Slim Fly networks with both non-minimal routing and queuing schemes, using several virtual networks to guarantee deadlock freedom. Each virtual network consists of a set of virtual channels that store packets separately according to a mapping policy. This diminishes the interaction among traffic flows, thus reducing head-of-line blocking. The results obtained from a simulation-based evaluation show that our proposal enhances performance in all the traffic cases, in contrast to other mechanisms whose performance drops in certain scenarios.
KEYWORDS
congestion management, deadlock freedom, high-performance interconnection networks, HoL blocking, non-minimal routing, Slim Fly topology
MOTIVATION
High-Performance Computing (HPC) systems and datacenters are growing in size because application demands for computing power and storage are increasing significantly. This means that the number and complexity of the processing and/or storage nodes of these systems will increase as well. The Top500 list [1] shows this trend, which is likely to continue in the near future. In this context, if the interconnection network is unable to meet the communication requirements of the applications, it may become the system bottleneck, thus degrading overall system performance.
Overdimensioning the network is an expensive option that may be unaffordable for some network operators. Hence, network designers try to minimize the number of network elements without losing performance. Numerous network topologies have been proposed in recent years to obtain a high performance/cost ratio. Among them, the Slim Fly topology [2] takes advantage of graph theory to connect switches while guaranteeing a network diameter of two. Although the Slim Fly topology uses fewer switches than other topologies [2], such as Dragonfly [3], Flattened Butterfly [4], or fat-trees [5], it requires switches with a higher number of ports for networks with a similar number of end nodes. Nevertheless, high-radix switches are currently available in the market.
Abbreviations: HoL blocking, head-of-line blocking; VC, virtual channel; VN, virtual network; VOQ, virtual output queuing.
Concurrency Computat Pract Exper. 2019;31:e4441. wileyonlinelibrary.com/journal/cpe
Slim F...