Drill

Ghorbani, Soudeh; Yang, Zibin; Godfrey, P. Brighten; Ganjali, Yashar; Firoozshahian, Amin

doi:10.1145/3098822.3098839

Cited by 181 publications

(13 citation statements)

References 39 publications

“…To avoid packet reordering, CAPS [23] encodes the short flows and spreads the packets of short flows to all path. To mitigate the micro-burst at the switch, DRILL [24] picks path for each packet flexibly based on the local queue information. To avoid vigorous rerouting, Hermes [15] reroutes packets in a timely yet cautious manner to good paths only when it will be beneficial.…”

Section: Related Work

mentioning

confidence: 99%

Tuning high flow concurrency for MPTCP in data center networks

Huang et al. 2020, J Cloud Comp

In the data center networks, multipath transmission control protocol (MPTCP) uses multiple subflows to balance traffic over parallel paths and achieve high throughput. Despite much recent progress in improving MPTCP performance in data center, how to adjust the number of subflows according to network status has remained elusive. In this paper, we reveal theoretically and empirically that controlling the number of concurrent subflows is very important in reducing flow completion time (FCT) under network dynamic. We further propose a novel design called MPTCP_OPN, which adaptively adjusts the number of concurrent subflows according to the real-time network state and flexibly shifts traffic from congested paths to mitigate the high tail latency. Experimental results show that MPTCP_OPN effectively reduces the timeout probability caused by full window loss and flow completion time by up to 50% compared with MPTCP protocol.
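
The citation statement above characterizes DRILL as picking a path for each packet from local queue information. As a rough illustration only, the Python sketch below shows one way such a per-packet choice could look: sample `d` random output ports, compare them with up to `m` ports remembered from earlier decisions, and send the packet to the shortest queue. The function name `drill_pick`, its parameters, and the tie-breaking rule are assumptions made for this example and are not taken from the cited papers.

```python
import random


def drill_pick(queue_lengths, memory, d=2, m=1, rng=random):
    """Choose an output port for one packet (hypothetical helper).

    queue_lengths: local queue occupancy per candidate port.
    memory: port indices remembered from earlier decisions; updated in place.
    d: number of ports sampled at random for this packet.
    m: number of least-loaded ports to remember for the next packet.
    """
    ports = range(len(queue_lengths))
    samples = rng.sample(ports, min(d, len(queue_lengths)))
    candidates = set(samples) | set(memory)

    # Rank candidates by queue length; break ties by port index (an assumption).
    ranked = sorted(candidates, key=lambda p: (queue_lengths[p], p))

    # Remember the m least-loaded candidates for the next decision.
    memory[:] = ranked[:m]
    return ranked[0]


if __name__ == "__main__":
    rng = random.Random(0)   # seeded only to make the demo repeatable
    queues = [0, 0, 0, 0]    # four equal-cost uplinks, initially empty
    remembered = []
    for _ in range(8):       # forward eight packets, one decision each
        port = drill_pick(queues, remembered, d=2, m=1, rng=rng)
        queues[port] += 1    # the packet joins the chosen queue
    print(queues)
```

The decision uses only state that is local to the switch, which is the property the citing papers contrast with schemes that rely on global congestion state.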

“…In-Network distributed: Distributed load balancing schemes either use local link utilization (Drill [9]) or global congestion state (Conga [2], Hula [15]) to route packets. While local state has difficulties to react to asymmetric links, approaches that require complete global state, e.g.…”

Section: Related Work

mentioning

confidence: 99%

Mp-Hula

Benet, Kassler, Benson et al. 2018, Proceedings of the 2018 Morning Workshop on in-Network Computing

Datacenter networks offer a large degree of multipath in order to provide large bisectional bandwidth. The end-to-end performance is determined by the load-balancing strategy which needs to be designed to effectively manage congestion. Consequently, congestion aware load-balancing strategies such as CONGA or HULA have been designed. Recently, more and more applications that are hosted on cloud servers use multipath transport protocols such as MPTCP. However, in the presence of MPTCP, existing load-balancing schemes including ECMP, HULA or CONGA may lead to suboptimal forwarding decisions where multiple MPTCP subflows of one connection are pinned on the same bottleneck link. In this paper, we present MP-HULA, a transport layer multi-path aware load-balancing scheme using Programmable Data Planes. First, instead of tracking congestion information for the best path towards the destination, each MP-HULA switch tracks congestion information for the best-k paths to a destination through the neighbor switches. Second, we design MP-HULA using Programmable Data Planes, where each leaf switch can identify, using P4, which MPTCP subflow belongs to which connection. MP-HULA then load-balances different MPTCP subflows of a MPTCP connection on different next hops considering congestion state while aggregating bandwidth. Our evaluation shows that MP-HULA with MPTCP outperforms HULA in average flow completion time (2.1x at 50% load, 1.7x at 80% load).

“…For example, in terms of controller-based methods, the interaction latency between switches and the controller may be orders of magnitude slower than the speed at which typical datacenter congestion events occur. They also react slowly to microbursts [4]. However, microbursts have been identified as the main culprit of packet loss in DCNs, which leads to retransmissions that impose significant latency and degrade application performance [4,46].…”

mentioning

confidence: 99%

“…They also react slowly to microbursts [4]. However, microbursts have been identified as the main culprit of packet loss in DCNs, which leads to retransmissions that impose significant latency and degrade application performance [4,46].…”

mentioning

confidence: 99%

QALL: Distributed Queue-Behavior-Aware Load Balancing Using Programmable Data Planes

Liu, Cai, Ling et al. 2024, IEEE Trans. Netw. Serv. Manage.

Existing load-balancing methods used in data center networks involve some shortcomings such as excessively large decision delays during reactions to microbursts and large overheads involved in active probing. Programmable data planes have provided new opportunities for local decision-making on switches to address these issues. We observe that queue behavior (i.e., queue occupancy, queuing trend, and dequeue time interval) in switches can reflect the current or future congestion degree on a network. Furthermore, following data-driven experiments, we found an accurate fitting function of congestion degree to queue behavior. Thus, we propose an in-network load-balancing scheme based on a programmable switch, called queue-behavior-aware localized load balancing (QALL). In QALL, each switch independently selects egress ports probabilistically according to fine-grained-measured local queue behavior. The key concept of QALL is to take into account the evolutionary process of reaching the current queue state in its decision basis for load balancing. Experimental results under actual DCN workloads (including web search and data mining workloads) demonstrate the effectiveness of QALL. In terms of flow completion time, decision delay, network shock, load sharing accuracy, and packet reordering, QALL outperformed recent per-packet (DRILL), per-flowlet (LetFlow and CONGA), and per-flow (ECMP) load balancers, particularly under heavy load. For example, under asymmetrical topology with 90% load level, the flow completion time of QALL was lower than that of ECMP, LetFlow, CONGA, and DRILL by up to 54.7%, 46.5%, 38.9%, and 18.9%, respectively.
