This paper studies the demand-capacity balancing (DCB) problem in air traffic flow management (ATFM) using collaborative multi-agent reinforcement learning (MARL). To apply appropriate ground delays for resolving airspace hotspots, a multi-agent asynchronous advantage actor-critic (MAA3C) framework is first constructed with a long short-term memory (LSTM) network for encoding observations, in which the number of agents varies across training steps. Unsupervised and supervised learning are then introduced for better collaboration and learning among the agents. Experimental results demonstrate the scalability and generalization of the proposed frameworks by applying the trained models to resolve various simulated and real-world DCB scenarios with different numbers of flights, numbers of sectors, and capacity settings.
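As a rough illustration of the architecture this abstract describes, the sketch below shows a recurrent actor-critic head in the spirit of MAA3C, where an LSTM encodes each agent's observation history and shared actor/critic heads let the number of agents vary per training step. This is not the authors' code; the module names, layer sizes, and action count are assumptions for illustration only.

```python
# Hedged sketch: shared recurrent actor-critic, assumed hyperparameters.
import torch
import torch.nn as nn

class RecurrentActorCritic(nn.Module):
    def __init__(self, obs_dim: int, hidden_dim: int, n_actions: int):
        super().__init__()
        self.encoder = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.actor = nn.Linear(hidden_dim, n_actions)   # ground-delay action logits
        self.critic = nn.Linear(hidden_dim, 1)          # state-value estimate

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (n_agents, seq_len, obs_dim); all agents share the same
        # weights, so n_agents can change from one training step to the next.
        out, hidden = self.encoder(obs_seq, hidden)
        last = out[:, -1]                               # last LSTM output per agent
        return self.actor(last), self.critic(last), hidden

# Example: 5 agents (flights) this step, each with a 10-step observation history.
model = RecurrentActorCritic(obs_dim=8, hidden_dim=64, n_actions=4)
logits, values, _ = model(torch.randn(5, 10, 8))
actions = torch.distributions.Categorical(logits=logits).sample()
```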
Rapidly evolving urban air mobility (UAM) creates heavy demand for public air transport tasks and poses great challenges to safe and efficient operation in low-altitude urban airspace. In this paper, operational conflicts are managed in the strategic phase with multi-agent reinforcement learning (MARL) in dynamic environments. To enable efficient operation, aircraft flight performance is integrated into the processes of multi-resolution airspace design, trajectory generation, conflict management, and MARL learning. The demand and capacity balancing (DCB) issue, separation conflicts, and block unavailability introduced by wind turbulence are resolved by the proposed multi-agent asynchronous advantage actor-critic (MAA3C) framework, in which recurrent actor-critic networks allow automatic action selection among ground delay, speed adjustment, and flight cancellation. To benchmark the trained models, the learned MAA3C parameters are also replaced with random values for comparison. Simulated training and test experiments on a small urban prototype and various combined use cases suggest the superiority of the MAA3C solution in resolving conflicts under complicated wind fields. The generalization, scalability, and stability of the model are also demonstrated when it is applied to complex environments.
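The random-weights comparison mentioned above could be set up roughly as sketched below: the trained parameters are copied and re-initialized so the trained policy can be evaluated against an untrained one on the same scenarios. The reset logic is an assumption; only the idea of the comparison comes from the abstract.

```python
# Hedged sketch of a random-parameter baseline for comparing against a trained model.
import copy
import torch.nn as nn

def random_baseline(trained_model: nn.Module) -> nn.Module:
    baseline = copy.deepcopy(trained_model)
    for module in baseline.modules():
        if hasattr(module, "reset_parameters"):
            module.reset_parameters()   # re-draw weights from the default initializer
    return baseline
```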
Tactical conflict management is a crucial issue for time-sensitive urban air mobility (UAM) operations, considering safety, security, and efficiency factors. To achieve real-time conflict resolution in structured UAM corridors, the operational environment is formulated as a graph structure, in which edges represent the available routes and node features are collected from flight states, e.g., arrival time, speed, arrival probability affected by uncertainties, and priority. To resolve short-term conflicts, a graph propagation solution is proposed that generates multiple augmented subgraph views from the prescribed graph, where each subgraph represents one candidate action, e.g., speed adjustment or local re-routing. Information in each subgraph is then aggregated and assessed with a global cost metric, and the final action is determined by ranking the cost values of all candidate subgraph views. Case studies involving a higher-priority intruder and a non-cooperative intruder demonstrate the effectiveness of the proposed solution in eliminating conflicts and reducing additional cost.
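A minimal sketch of the subgraph-ranking idea described above is given below: each candidate tactical action is expressed as an augmented view of the route graph, a global cost is aggregated over the affected flight nodes, and the cheapest conflict-free view is selected. The graph attributes and cost terms here are illustrative assumptions, not the paper's actual metric.

```python
# Hedged sketch: rank candidate subgraph views by an aggregated global cost.
import networkx as nx

def evaluate_view(view: nx.DiGraph) -> float:
    # Aggregate assumed node-level features (delay, deviation) into one cost,
    # penalizing any remaining separation conflict heavily.
    cost = sum(d.get("delay", 0.0) + d.get("deviation", 0.0)
               for _, d in view.nodes(data=True))
    if any(d.get("conflict", False) for _, d in view.nodes(data=True)):
        cost += 1e6
    return cost

def select_action(base_graph: nx.DiGraph, candidate_actions) -> str:
    # candidate_actions: mapping from action name (e.g. "speed_adjustment",
    # "local_reroute") to a function returning the augmented subgraph view
    # obtained by applying that action to the base graph.
    views = {name: build(base_graph) for name, build in candidate_actions.items()}
    return min(views, key=lambda name: evaluate_view(views[name]))
```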
As urban air mobility (UAM) evolves quickly, the great demand for public airborne transit and deliveries, besides creating a large market, will result in a series of technical, operational, and safety problems. This paper addresses the strategic conflict issue in low-altitude UAM operations with multi-agent reinforcement learning (MARL). Considering differences in flight characteristics, aircraft performance is fully integrated into the design of the strategic deconfliction components. With this concept, a multi-resolution structure for low-altitude airspace organization, a Gaussian mixture model (GMM) for speed profile generation, and dynamic separation minima enable efficient UAM operations. To resolve the demand and capacity balancing (DCB) issue and separation conflicts at the strategic stage, a multi-agent asynchronous advantage actor-critic (MAA3C) framework is built with masked recurrent neural networks (RNNs), which handles variable numbers of agents, dynamic environments, heterogeneous aircraft performance, and action selection between speed adjustment and ground delay. Experiments conducted on a developed prototype and various scenarios indicate the clear advantages of the constructed MAA3C framework in minimizing delay cost and refining speed profiles, and the effectiveness, scalability, and stability of the MARL solution are ultimately demonstrated.
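The GMM-based speed profile generation mentioned above could look roughly like the sketch below: a Gaussian mixture model is fitted to sampled speed profiles along a route and new candidate profiles are drawn from the fitted mixture. The data shapes, component count, and synthetic training data are assumptions for illustration; only the use of a GMM for speed profile generation comes from the abstract.

```python
# Hedged sketch: fit a GMM to assumed speed-profile samples and draw new profiles.
import numpy as np
from sklearn.mixture import GaussianMixture

# Each row is one flight's speed profile sampled at fixed waypoints (m/s);
# synthetic placeholder data stands in for performance-derived samples.
historical_profiles = np.random.default_rng(0).normal(45.0, 5.0, size=(200, 12))

gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
gmm.fit(historical_profiles)

# Draw candidate profiles for new operations; downstream deconfliction could
# then refine them via the speed-adjustment action.
candidate_profiles, _ = gmm.sample(n_samples=5)
```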