Multiagent Reinforcement Learning Methods to Resolve Demand Capacity Balance Problems

Spatharis, Christos; Kravaris, Theocharis; Vouros, George A.; Blekas, Konstantinos; Chalkiadakis, Georgios; García, J.M. Cordero; Fernández, Esther Calvo

doi:10.1145/3200947.3201010

Cited by 16 publications

(9 citation statements)

References 10 publications


“…All the agents aim at maximizing the expected discounted return $\mathbb{E}[G^{i}_{s_t}]$. According to (5), all the agents have the same objective.…”

Section: B. POMDP (mentioning)

confidence: 99%
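
The expected discounted return named in the statement above can be illustrated with a minimal sketch. The function names, the discount factor `gamma`, and the sample-mean estimate of the expectation are assumptions made for illustration; they are not taken from the citing paper.

```python
def discounted_return(rewards, gamma=0.99):
    """Return sum_k gamma**k * rewards[k] for one agent's reward sequence."""
    g = 0.0
    # Accumulate backwards so each step needs a single multiplication by gamma.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def expected_return(episodes, gamma=0.99):
    """Estimate the expected discounted return as the mean over sampled episodes."""
    returns = [discounted_return(ep, gamma) for ep in episodes]
    return sum(returns) / len(returns)


if __name__ == "__main__":
    # Two hypothetical reward sequences observed by the same agent.
    print(expected_return([[1.0, 1.0, 1.0], [0.0, 2.0]], gamma=0.5))
```

When every agent receives the same reward signal, each agent's return is identical, which is one way the agents can share a single objective.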

“…The demand-capacity imbalance could be constructed as the interacted networks of flight trajectories, in which agents with interactions were defined as "peers" and the connection of "peers" neighbourhood promoted the information propagation. Independent reinforcement learning, edge-based multi-agent reinforcement learning and agent-based multi-agent learning were proposed according to the features of agents' coordination graph [5]. The hierarchical reinforcement learning frameworks were proposed based on the state-action abstraction and temporal action abstraction by taking advantage of the coordination of agents to handle real-world problems.…”

Section: Introduction (mentioning)

confidence: 99%


Integrated Frameworks of Unsupervised, Supervised and Reinforcement Learning for Solving Air Traffic Flow Management Problem

Huang, 2021

2021 IEEE/AIAA 40th Digital Avionics Systems Conference (DASC)


This paper studies the demand-capacity balancing (DCB) problem in air traffic flow management (ATFM) with collaborative multi-agent reinforcement learning (MARL). To assign appropriate ground delays that resolve airspace hotspots, a multi-agent asynchronous advantage actor-critic (MAA3C) framework, in which the number of agents varies across training steps, is first constructed with a long short-term memory (LSTM) network for the observations. Unsupervised and supervised learning are then introduced to improve collaboration and learning among the agents. Experimental results demonstrate the scalability and generalization of the proposed frameworks: the trained models are applied to resolve different simulated and real-world DCB scenarios with varying numbers of flights, numbers of sectors, and capacity settings.
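
To make the notion of a hotspot in the abstract above concrete, here is a minimal sketch that counts demand per sector and time period and checks it against capacity. The data layout (tuples of flight, sector, period), the function names, and the capacity dictionary are hypothetical choices for illustration, not the paper's formulation.

```python
from collections import Counter


def find_hotspots(entries, capacity):
    """Return {(sector, period): demand} where demand exceeds the sector capacity.

    entries: iterable of (flight_id, sector, period) tuples (assumed layout).
    capacity: dict mapping sector -> maximum flights per period (assumed layout).
    """
    demand = Counter((sector, period) for _, sector, period in entries)
    return {key: n for key, n in demand.items() if n > capacity[key[0]]}


def apply_ground_delay(entries, delays):
    """Shift every sector entry of a flight by its ground delay, in periods."""
    return [(f, s, p + delays.get(f, 0)) for f, s, p in entries]


if __name__ == "__main__":
    entries = [("F1", "A", 0), ("F2", "A", 0), ("F3", "A", 1)]
    capacity = {"A": 1}
    print(find_hotspots(entries, capacity))
    print(find_hotspots(apply_ground_delay(entries, {"F2": 2}), capacity))
```

In this toy case, delaying one flight by two periods clears the hotspot; a learned policy would have to choose such delays for many interacting flights at once.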



“…The system can be constructed with the network structure, in which agents with interactions are defined as "peers" and connected for information propagation. With this definition, the edge-based and agent-based reinforcement learning leverage the coordination graph to solve the DCB issue [13]. And to enable collaboration among multiple agents, the hierarchical reinforcement learning formulates state-action abstraction and temporal action abstraction to resolve the congestion issue [14].…”

Section: Introduction (mentioning)

confidence: 99%

Strategic Conflict Management for Performance-based Urban Air Mobility Operations with Multi-agent Reinforcement Learning

Huang; Petrunin; Tsourdos, 2022

2022 International Conference on Unmanned Aircraft Systems (ICUAS)


As urban air mobility (UAM) evolves quickly, the strong demand for public airborne transit and deliveries will create a large market but also a series of technical, operational, and safety problems. This paper addresses strategic conflicts in low-altitude UAM operations with multi-agent reinforcement learning (MARL). To account for differences in flight characteristics, aircraft performance is fully integrated into the design of the strategic deconfliction components. Under this concept, a multi-resolution structure for organizing low-altitude airspace, a Gaussian Mixture Model (GMM) for speed profile generation, and dynamic separation minima enable efficient UAM operations. To resolve the demand and capacity balancing (DCB) issue and separation conflicts at the strategic stage, a multi-agent asynchronous advantage actor-critic (MAA3C) framework is built with mask recurrent neural networks (RNNs). The framework handles a variable number of agents, dynamic environments, heterogeneous aircraft performance, and action selection between speed adjustment and ground delay. Experiments on a developed prototype and various scenarios indicate clear advantages of the constructed MAA3C in minimizing delay cost and refining speed profiles, and demonstrate the effectiveness, scalability, and stability of the MARL solution.


“…For example, in the air traffic simulator FACET [11], some specific location points in the two-dimensional space were taken as agents, training which to decide the safety separation between the passing aircraft [12], or each aircraft was used as an agent to train itself to allocate an appropriate delay based on GDP. In this mode, scholars have explored many MARL frameworks, such as edge-based, agent-based, hierarchical MARL framework [13], [14].…”

Section: Introduction (mentioning)

confidence: 99%

Demand and Capacity Balancing Technology Based on Multi-agent Reinforcement Learning

Chen et al., 2021

2021 IEEE/AIAA 40th Digital Avionics Systems Conference (DASC)


To effectively solve Demand and Capacity Balancing (DCB) in large-scale, high-density scenarios through the Ground Delay Program (GDP) in the pre-tactical stage, a sequential decision-making framework based on a time window is proposed. On this basis, the problem is transformed into a Markov Decision Process (MDP) based on local observation, and a Multi-Agent Reinforcement Learning (MARL) method is then adopted. Each flight is regarded as an independent agent that decides whether to implement GDP according to its local state observation. By designing the reward function in multiple combinations, a Mixed Competition and Cooperation (MCC) mode that considers fairness is formed among the agents. To improve the efficiency of MARL, we use the double Q-Learning Network (DQN), experience replay, an adaptive ε-greedy strategy, and a Decentralized Training with Decentralized Execution (DTDE) framework. The experimental results show that the training process of the MARL method is convergent, efficient, and stable. Compared with the Computer-Assisted Slot Allocation (CASA) method used in actual operation, the number of flight delays and the average delay time are reduced by 33.7% and 36.7%, respectively.
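
Two ingredients named in the abstract above, an adaptive ε-greedy strategy and a double Q-learning target, can be sketched as follows. The linear decay schedule, the parameter values, the function names, and the two-action setting (0 = no delay, 1 = apply ground delay) are assumptions for illustration; the abstract does not specify them.

```python
import random


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=1000):
    """Linearly anneal epsilon from eps_start to eps_end (assumed schedule)."""
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)


def select_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def double_q_target(reward, gamma, q_online_next, q_target_next, done):
    """Double Q-learning target: the online estimates choose the next action,
    the target estimates evaluate it."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


if __name__ == "__main__":
    q = [0.1, 0.9]  # hypothetical values for: 0 = no delay, 1 = ground delay
    print(select_action(q, epsilon_at(step=1000)))
    print(double_q_target(1.0, 0.5, [0.2, 0.8], [3.0, 1.0], done=False))
```

Separating action selection from action evaluation is the standard way double Q-learning reduces the overestimation that a single max over noisy estimates produces.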
