Many data center applications today rely on distributed computation models such as MapReduce and Bulk Synchronous Parallel (BSP) for data-intensive computation at scale [4]. These models scale by leveraging the partition/aggregate pattern, in which data and computations are distributed across many worker servers, each performing part of the computation. A communication phase is needed each time the workers synchronize the computation and, finally, to produce the output. In these applications, network communication costs can be one of the dominant scalability bottlenecks, especially for multi-stage or iterative computations [1].

The advent of flexible networking hardware and expressive data plane programming languages has produced networks that are deeply programmable [2]. This creates the opportunity to co-design distributed systems with their network layer, which can offer substantial performance benefits. One possible use of this emerging technology is to execute logic traditionally associated with the application layer in the network itself. Because the intermediate results of the applications mentioned above are necessarily exchanged through the network, it is desirable to offload part of the aggregation task to the network, both to reduce the traffic and to lessen the work of the servers. However, programmable networking devices typically have very stringent constraints on the number and type of operations that can be performed at line rate. Moreover, packet processing at high speed requires very fast memory, such as TCAM or SRAM, which is expensive and usually available only in small capacities.
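The traffic reduction that motivates this offload can be illustrated with a small sketch. It is not taken from the system described here: the word-count style sum, the four workers, the two intermediate switches, and all names (`aggregate`, `worker_outputs`, `switch_a`, `switch_b`) are assumptions chosen only for illustration, and the sketch ignores the line-rate and memory constraints mentioned above.

```python
from collections import Counter


def aggregate(partials):
    """Merge partial key/value results by summing the values of each key."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return total


# Hypothetical intermediate results produced by four workers (key -> count).
worker_outputs = [
    Counter({"net": 3, "data": 1}),
    Counter({"net": 2, "plane": 4}),
    Counter({"data": 5, "net": 1}),
    Counter({"plane": 1, "data": 2}),
]

# Baseline: every worker sends all of its pairs to the destination.
pairs_without_offload = sum(len(p) for p in worker_outputs)
result_without_offload = aggregate(worker_outputs)

# Offload (assumed topology): two switches each merge the pairs of two
# workers before forwarding them to the destination.
switch_a = aggregate(worker_outputs[:2])
switch_b = aggregate(worker_outputs[2:])
pairs_with_offload = len(switch_a) + len(switch_b)
result_with_offload = aggregate([switch_a, switch_b])

print(pairs_without_offload, pairs_with_offload)
print(result_with_offload == result_without_offload)
```

In this toy setup the destination receives 6 key/value pairs instead of 8 and computes the same totals. The saving depends entirely on how many keys the workers behind a switch share, so the numbers here carry no weight beyond the illustration.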
* Amedeo Sapio is also with Politecnico di Torino.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). SoCC '17, September 24-27, 2017, Santa Clara, CA, USA © 2017 Copyright held by the owner/author(s). ACM ISBN 978-1-4503-5028-0/17/09. https://doi.org/10.1145/3127479.3132018

THE DAIET APPROACH

In this work, we propose DAIET, a system for data aggregation in-network. DAIET leverages the programmable data plane to reduce the traffic as it is being forwarded towards the destination by opportunistically offloading the aggregation task to the network. In many distributed algorithms, the aggregation function is commutative and associative. Therefore, each network device can independently aggregate part of the data without affecting the correctness of the result. The destination workers remain in charge of the portion of the aggregation task that is not handled by the network.

Since these applications typically exchange the intermediate results with many-to-one communications, DAIET models this pattern using several in-network aggregation trees, where the r...