2019
DOI: 10.1109/lcomm.2019.2922658

Optimizing Pipelined Computation and Communication for Latency-Constrained Edge Learning

Abstract: Consider a device that is connected to an edge processor via a communication channel. The device holds local data that is to be offloaded to the edge processor so as to train a machine learning model, e.g., for regression or classification. Transmission of the data to the learning processor, as well as training based on Stochastic Gradient Descent (SGD), must both be completed within a time limit. Assuming that communication and computation can be pipelined, this letter investigates the optimal choice for the …
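To make the pipelining assumption in the abstract concrete, here is a minimal Python sketch of the timing model it describes: samples are transmitted one after another, the edge processor runs an SGD step on each sample as soon as it has arrived, and the whole pipeline must finish within a deadline. All names and numbers below (t_comm, t_comp, the example budget) are illustrative assumptions, not notation or values from the letter.

```python
# Sketch (assumed model): two-stage pipeline of per-sample transmission
# followed by one SGD step per received sample, under a latency budget.

def pipelined_finish_time(n_samples, t_comm, t_comp):
    """Finish time when transmission of the next sample overlaps with the
    SGD step on samples already received (simple two-stage pipeline)."""
    recv_done = 0.0   # time the current sample is fully received
    comp_done = 0.0   # time the SGD step on the current sample finishes
    for _ in range(n_samples):
        recv_done += t_comm                              # next sample arrives
        comp_done = max(comp_done, recv_done) + t_comp   # SGD starts once data is ready
    return comp_done

def max_samples_within_deadline(deadline, t_comm, t_comp):
    """Largest number of offloaded samples whose pipelined transmission
    plus SGD processing still fits in the latency budget."""
    n = 0
    while pipelined_finish_time(n + 1, t_comm, t_comp) <= deadline:
        n += 1
    return n

if __name__ == "__main__":
    # Example (assumed): 1 ms per sample over the channel, 0.4 ms per SGD step, 50 ms budget
    print(max_samples_within_deadline(deadline=50.0, t_comm=1.0, t_comp=0.4))
```

In this toy setting the channel is the slower stage, so the deadline effectively caps how many samples can be offloaded and learned from; the letter's actual optimization additionally accounts for per-packet overhead.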

Cited by 16 publications (13 citation statements)
References 9 publications
“…Popular traditional methods for distributed inference in the online learning context use incremental updates among nodes [1]- [3], [36]- [38]. In recent years, other stochastic gradient-descent (SGD) based methods were developed for federated learning [36], [37]. While these methods do not require prior knowledge of the observation distributions, they use orthogonal channels among node transmissions, which results in high bandwidth and energy consumption.…”
Section: Related Work (mentioning)
confidence: 99%
“…Other recent general surveys can be found in [21]- [23]. Going into more specific contributions, the authors of [24] consider an edge machine learning system, where an edge processor runs an algorithm based on Stochastic Gradient Descent (SGD). In particular, they investigate the trade-off between latency and accuracy, by optimizing the packet payload size, given the overhead of each data packet transmission and the ratio between the computation and communication rates.…”
Section: A. Related Work (mentioning)
confidence: 99%
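The payload-size trade-off mentioned in the excerpt above can be illustrated with a rough sketch: each packet pays a fixed transmission overhead, so larger payloads amortize overhead better, but the SGD pass over a packet's samples can only start once the whole packet has arrived, which degrades pipelining. The model and all parameters below (t_overhead, t_per_sample, t_sgd) are assumptions for illustration, not taken from [24] or from the letter.

```python
# Sketch (assumed model): choose the packet payload that maximizes the number
# of samples transmitted and processed within a fixed latency budget.

def samples_delivered(payload, deadline, t_overhead, t_per_sample, t_sgd):
    """Samples sent and processed within the deadline when each packet carries
    `payload` samples and costs a fixed overhead on top of its payload time."""
    t_packet = t_overhead + payload * t_per_sample  # time to transmit one packet
    n_packets = 0
    finish = 0.0  # time the SGD pass over the last received packet ends
    while True:
        arrival = (n_packets + 1) * t_packet            # next packet fully received
        done = max(finish, arrival) + payload * t_sgd   # pipelined SGD on its samples
        if done > deadline:
            break
        finish = done
        n_packets += 1
    return n_packets * payload

def best_payload(deadline, t_overhead, t_per_sample, t_sgd, max_payload=256):
    """Exhaustive search over payload sizes (fine for a toy model)."""
    return max(range(1, max_payload + 1),
               key=lambda p: samples_delivered(p, deadline, t_overhead, t_per_sample, t_sgd))

if __name__ == "__main__":
    # Example (assumed): 0.5 ms overhead per packet, 0.05 ms per sample on the
    # channel, 0.02 ms per SGD step, 50 ms budget.
    print(best_payload(deadline=50.0, t_overhead=0.5, t_per_sample=0.05, t_sgd=0.02))
```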
“…Using (24), the evolution of the virtual queue Y_k(t) can be written as (14), replacing the closed-form expression G_k(t) with Ĝ_k(t). Using Ĝ_k(t) is useful for the virtual queue's update, but it is not directly related to the number of quantization bits, which affect the learning accuracy.…”
Section: Data-Driven Control of Learning Accuracy (mentioning)
confidence: 99%
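For readers unfamiliar with the virtual-queue device this excerpt relies on, a generic Lyapunov-style update is sketched below. The actual recursion for Y_k(t) and the definition of G_k(t) (or its data-driven estimate) in the cited work are not reproduced here; this is only an assumed illustration of the mechanism.

```python
# Generic virtual-queue update (assumed, not the cited paper's exact form):
# the queue grows when the instantaneous metric G exceeds its target and
# drains otherwise, so keeping the queue stable keeps the long-term average
# of G below the target.

def virtual_queue_step(Y, G, target):
    return max(0.0, Y + G - target)
```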
“…In the edge learning framework, having in mind the goal of learning and the resources dedicated to that goal, several trade-offs are possible, like the trade-off between power consumption and delay, between accuracy and delay, etc. The authors of [61] consider an edge machine learning system, where an edge processor runs an algorithm based on stochastic gradient descent (SGD), to reach a trade-off between latency and accuracy, by optimizing the packet payload size, given the overhead of each data packet transmission and the ratio between the computation and the communication rates. In [62], the authors proposed an algorithm to maximize the learning accuracy under latency constraints.…”
Section: Goal-oriented Communication (mentioning)
confidence: 99%