Query-Age-Optimal Scheduling Under Sampling and Transmission Constraints

Zakeri, Abolfazl; Moltafet, Mohammad; Leinonen, Markus; Codreanu, Marian

doi:10.1109/lcomm.2023.3247244

Cited by 4 publications

(4 citation statements)

References 37 publications


“…An interesting future work would be to consider individual constraints on the average number of transmissions for both the transmitter-relay and relay-destination links. This would lead to a stochastic optimization problem with multiple average constraints, which may be tackled via solution methods developed in [63].…”

Section: Discussion (mentioning)

confidence: 99%

Minimizing the AoI in Resource-Constrained Multi-Source Relaying Systems: Dynamic and Learning-Based Scheduling

Zakeri

Moltafet

Leinonen

et al. 2024

IEEE Trans. Wireless Commun.

Self Cite


We consider a multi-source relaying system where independent sources randomly generate status update packets which are sent to the destination with the aid of a relay through unreliable links. We develop transmission scheduling policies to minimize the weighted sum average age of information (AoI) subject to transmission capacity and long-run average resource constraints. We formulate a stochastic control optimization problem and solve it using a constrained Markov decision process (CMDP) approach and a drift-plus-penalty method. The CMDP problem is solved by transforming it into an MDP problem using the Lagrangian relaxation method. We theoretically analyze the structure of optimal policies for the MDP problem and subsequently propose a structure-aware algorithm that returns a practical near-optimal policy. Using the drift-plus-penalty method, we devise a near-optimal low-complexity policy that performs the scheduling decisions dynamically. We also develop a model-free deep reinforcement learning policy for which the Lyapunov optimization theory and a dueling double deep Q-network are employed. The complexities of the proposed policies are analyzed. Simulation results are provided to assess the performance of our policies and validate the theoretical results. The results show up to 91% performance improvement compared to a baseline policy.

Index Terms: Age of information (AoI), relay, constrained Markov decision process (CMDP), drift-plus-penalty, deep reinforcement learning.

Footnote 1: This relay could be a static node [10] or a mobile node, e.g., unmanned aerial vehicle (UAV) [26]-[31] or a vehicle in the vehicular communications [32]. For instance, in [30], multiple UAVs serve as mobile relays between the sensors and the base station, and the goal is to optimize the UAVs' trajectories to minimize the average AoI and energy consumption.
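The drift-plus-penalty idea named in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes a much simpler single-hop toy model (no relay, fresh packets always available, one transmission per slot, a single average-transmission budget), and the names `dpp_schedule`, `simulate`, `budget`, and `V` are hypothetical. A virtual queue tracks how far the running transmission count exceeds the budget, and each slot the scheduler transmits only when the scaled expected age reduction exceeds the current queue value.

```python
import random

def dpp_schedule(ages, weights, p_success, queue, V):
    """Return the index of the source to transmit, or None to stay idle (toy model)."""
    # Scaled expected weighted-AoI reduction if source i is scheduled:
    # on success (prob. p_success) its age resets to 1 instead of growing.
    gains = [V * w * p_success * a for a, w in zip(ages, weights)]
    best = max(range(len(ages)), key=gains.__getitem__)
    # Transmit only if the gain outweighs the virtual-queue "price" of one transmission.
    return best if gains[best] > queue else None

def simulate(num_slots=20000, weights=(1.0, 2.0), p_success=0.7,
             budget=0.4, V=10.0, seed=0):
    """Run the toy scheduler; return (average weighted AoI, average transmission rate)."""
    rng = random.Random(seed)
    ages = [1] * len(weights)
    queue = 0.0          # virtual queue for the average-transmission constraint
    total_cost = 0.0
    total_tx = 0
    for _ in range(num_slots):
        choice = dpp_schedule(ages, weights, p_success, queue, V)
        tx = 0 if choice is None else 1
        delivered = tx == 1 and rng.random() < p_success
        # Delivered source resets to age 1; every other age grows by one slot.
        ages = [1 if (delivered and i == choice) else a + 1
                for i, a in enumerate(ages)]
        # Virtual queue update: grows when we transmit more often than the budget allows.
        queue = max(queue - budget + tx, 0.0)
        total_cost += sum(w * a for w, a in zip(weights, ages))
        total_tx += tx
    return total_cost / num_slots, total_tx / num_slots

if __name__ == "__main__":
    avg_aoi, avg_tx = simulate()
    print(f"average weighted AoI: {avg_aoi:.3f}, average transmissions per slot: {avg_tx:.3f}")
```

In this sketch a larger `V` puts more weight on the age penalty and lets the virtual queue grow larger before transmissions are suppressed, which is the usual trade-off in drift-plus-penalty designs.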


“…At each slot, we aim to find the best command action a(t) that optimizes an average performance metric subject to energy causality constraint (2). Formally, our goal is to solve the following stochastic control problem: minimize lim sup…”

Section: B. Performance Metrics and Problem Formulation (mentioning)

confidence: 99%

“…However, the main difficulty comes from the fact that the state space is infinite. Thus, methods such as RVI and linear programming [19], which are only applicable for problems with a finite state space, cannot be directly utilized. Nonetheless, problem (36) is an MDP problem and can be solved via reinforcement learning algorithms that use approximation methods to approximate either the Q-function or optimal policy directly.…”

Section: B. The Age of Incorrect Information Metric (mentioning)

confidence: 99%

Semantic-aware Sampling and Transmission in Real-time Tracking Systems: A POMDP Approach

Zakeri,

Moltafet,

Codreanu

2024

Preprint


We address the problem of real-time remote tracking of a partially observable Markov source in an energy harvesting system with an unreliable communication channel. We consider both sampling and transmission costs. Different from most prior studies that assume the source is fully observable, the sampling cost renders the source partially observable. The goal is to jointly optimize sampling and transmission policies for two semantic-aware metrics: i) a general distortion measure and ii) the age of incorrect information (AoII). We formulate a stochastic control problem. To solve the problem for each metric, we cast a partially observable Markov decision process (POMDP), which is transformed into a belief MDP. Then, for both AoII under the perfect channel setup and distortion, we express the belief as a function of the age of information (AoI). This expression enables us to effectively truncate the corresponding belief space and formulate a finite-state MDP problem, which is solved using the relative value iteration algorithm. For the AoII metric in the general setup, a deep reinforcement learning policy is proposed to solve the belief MDP problem. Simulation results show the effectiveness of the derived policies and, in particular, reveal a non-monotonic switching-type structure of the real-time optimal policy with respect to AoI.
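The abstract above states that the truncated finite-state MDP is solved with the relative value iteration (RVI) algorithm. The following Python sketch shows generic RVI for an average-cost MDP under stated assumptions: it does not reproduce the paper's belief MDP, and the toy model built by `build_aoi_mdp` (age truncated at `max_age`, two actions, a per-transmission cost `tx_cost`) as well as all function names are hypothetical.

```python
def build_aoi_mdp(max_age, p_success, tx_cost):
    """Toy truncated-AoI MDP: state index s means age s + 1; actions 0 = idle, 1 = transmit."""
    n = max_age
    idle = [[0.0] * n for _ in range(n)]
    send = [[0.0] * n for _ in range(n)]
    for s in range(n):
        nxt = min(s + 1, n - 1)           # age grows by one, capped at max_age
        idle[s][nxt] = 1.0
        send[s][0] += p_success           # success: age resets to 1
        send[s][nxt] += 1.0 - p_success   # failure: age keeps growing
    cost = [[float(s + 1), float(s + 1) + tx_cost] for s in range(n)]
    return [idle, send], cost

def relative_value_iteration(P, c, ref_state=0, tol=1e-9, max_iter=10000):
    """P[a][s][s2]: transition probability; c[s][a]: per-slot cost.

    Returns (average cost estimate, relative value function, greedy policy).
    """
    n_states, n_actions = len(c), len(P)
    h = [0.0] * n_states
    for _ in range(max_iter):
        # One Bellman backup for every state-action pair.
        q = [[c[s][a] + sum(P[a][s][s2] * h[s2] for s2 in range(n_states))
              for a in range(n_actions)] for s in range(n_states)]
        v = [min(row) for row in q]
        gain = v[ref_state]               # average-cost estimate
        h_new = [x - gain for x in v]     # subtract the reference value to keep h bounded
        converged = max(abs(x - y) for x, y in zip(h_new, h)) < tol
        h = h_new
        if converged:
            break
    policy = [min(range(n_actions), key=q[s].__getitem__) for s in range(n_states)]
    return gain, h, policy

if __name__ == "__main__":
    P, c = build_aoi_mdp(max_age=10, p_success=0.8, tx_cost=2.0)
    gain, h, policy = relative_value_iteration(P, c)
    print("average cost:", round(gain, 4))
    print("policy by age (0 = idle, 1 = transmit):", policy)
```

Subtracting the value at a fixed reference state in every iteration is what distinguishes RVI from plain value iteration; it keeps the iterates bounded in the average-cost setting, and it requires a finite state space, which is why the truncation step described in the abstract matters.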


“…However, the main difficulty comes from the fact that the state space of the problem is infinite. Thus, methods such as RVI and linear programming [40], which are only applicable for problems with a finite state space, cannot be directly utilized. Nonetheless, problem (36) is an MDP problem and can be solved via reinforcement learning algorithms that use approximation methods to approximate either the Q-function or optimal policy directly.…”

Section: The Age of Incorrect Information Metric (mentioning)

confidence: 99%