Edge computing has emerged as a paradigm for processing tasks locally, reducing the distance over which data must be transferred. This creates an opportunity for distributed machine learning, which is data-transfer intensive. In this paper, we develop a solution for serving distributed Machine Learning (ML) training jobs across the edge-cloud continuum. We model the specific requirements of each ML job and the features of the edge and cloud resources. Next, we develop an Integer Linear Programming (ILP) algorithm to perform the resource allocation. We examine different scenarios (different processing and bandwidth costs) and quantify the tradeoffs between performance and the cost of edge/cloud bandwidth and processing resources. Our simulations indicate that, although many parameters determine the allocation, processing costs play on average the most important role. Cloud bandwidth costs can be significant in certain scenarios. Finally, in certain examined cases, using edge and cloud resources together yields significant monetary benefits compared to using exclusively edge or exclusively cloud resources.
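To make the allocation problem concrete, the following is a minimal sketch of the kind of decision such a model has to make. It is not the paper's ILP formulation: it enumerates every job-to-site assignment by brute force instead of calling a solver, and all job names, capacities, and cost values are hypothetical numbers chosen only for illustration.

```python
from itertools import product

# Hypothetical toy data (assumption, not taken from the paper).
# Each job needs some CPU units and moves some data to its training site.
JOBS = {
    "job_a": {"cpu": 4, "data_gb": 10},
    "job_b": {"cpu": 2, "data_gb": 50},
    "job_c": {"cpu": 6, "data_gb": 5},
}

# Each site has a CPU capacity, a cost per CPU unit, and a cost per GB moved.
# Data is assumed to originate at the edge, so edge transfers cost nothing.
SITES = {
    "edge": {"cpu_capacity": 8, "cpu_cost": 3.0, "bw_cost": 0.0},
    "cloud": {"cpu_capacity": 100, "cpu_cost": 1.0, "bw_cost": 0.1},
}


def allocation_cost(assignment, jobs, sites):
    """Total processing plus bandwidth cost of a job -> site assignment."""
    total = 0.0
    for job, site in assignment.items():
        total += jobs[job]["cpu"] * sites[site]["cpu_cost"]
        total += jobs[job]["data_gb"] * sites[site]["bw_cost"]
    return total


def is_feasible(assignment, jobs, sites):
    """Check that no site is asked for more CPU than it has."""
    used = {site: 0 for site in sites}
    for job, site in assignment.items():
        used[site] += jobs[job]["cpu"]
    return all(used[s] <= sites[s]["cpu_capacity"] for s in sites)


def best_allocation(jobs, sites):
    """Enumerate all assignments and return the cheapest feasible one."""
    best, best_cost = None, float("inf")
    names = list(jobs)
    for choice in product(sites, repeat=len(names)):
        assignment = dict(zip(names, choice))
        if not is_feasible(assignment, jobs, sites):
            continue
        cost = allocation_cost(assignment, jobs, sites)
        if cost < best_cost:
            best, best_cost = assignment, cost
    return best, best_cost


if __name__ == "__main__":
    assignment, cost = best_allocation(JOBS, SITES)
    print(assignment, cost)
```

With these made-up numbers, the cheapest feasible assignment places the data-heavy job at the edge and the compute-heavy jobs in the cloud, which costs less than sending everything to the cloud; an edge-only assignment exceeds the assumed edge capacity. Brute-force enumeration grows exponentially with the number of jobs, which is one reason an ILP solver would be preferred at realistic scale.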