Reinforcement learning (RL) is a promising data-driven approach for adaptive traffic signal control (ATSC) in complex urban traffic networks, and deep neural networks further enhance its learning power. However, centralized RL is infeasible for large-scale ATSC due to the extremely high dimension of the joint action space. Multi-agent RL (MARL) overcomes the scalability issue by distributing the global control to each local RL agent, but it introduces a new challenge: the environment becomes partially observable from the viewpoint of each local agent due to limited communication among agents. Most existing studies in MARL focus on designing efficient communication and coordination among traditional Q-learning agents. This paper presents, for the first time, a fully scalable and decentralized MARL algorithm for the state-of-the-art deep RL agent, advantage actor critic (A2C), within the context of ATSC. In particular, two methods are proposed to stabilize the learning procedure by improving the observability and reducing the learning difficulty of each local agent. The proposed multi-agent A2C is compared against independent A2C and independent Q-learning algorithms, in both a large synthetic traffic grid and a large real-world traffic network of the city of Monaco, under simulated peak-hour traffic dynamics. Results demonstrate its optimality, robustness, and sample efficiency over other state-of-the-art decentralized MARL algorithms.
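The scalability argument above can be illustrated with a short arithmetic sketch. It assumes every intersection offers the same number of signal phases; the function names and the example sizes (25 intersections, 5 phases) are hypothetical and are not taken from the abstract.

```python
def joint_action_count(num_intersections: int, phases_per_intersection: int) -> int:
    """Size of the joint action space a single centralized agent must cover.

    Assumes (hypothetically) that every intersection has the same number
    of phases, so the joint space is the Cartesian product of local spaces.
    """
    return phases_per_intersection ** num_intersections


def local_action_count(phases_per_intersection: int) -> int:
    """Size of the action space seen by one decentralized agent."""
    return phases_per_intersection


if __name__ == "__main__":
    # Hypothetical network: 25 intersections, 5 phases each.
    n, k = 25, 5
    print("centralized joint actions:", joint_action_count(n, k))
    print("actions per local agent:  ", local_action_count(k))
```

The joint space grows exponentially with the number of intersections, while each local agent's action space stays constant, which is the motivation for distributing control.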
A soft, stretchable, and fully enclosed self-charging power system is developed by seamlessly combining a stretchable triboelectric nanogenerator with stretchable supercapacitors. Owing to its fully soft structure, the system can undergo, and harvest energy from, almost all kinds of large-degree deformation. The power system is washable and waterproof owing to its fully enclosed structure and the hydrophobic property of its exterior surface. It can be worn on the human body to effectively scavenge energy from various kinds of human motion, and the wearable power source is demonstrated to drive an electronic watch. This work provides a feasible approach to designing stretchable, wearable power sources and electronics.
In this article, we study multiple attribute decision-making (MADM) problems with picture fuzzy number (PFN) information. We adopt the Muirhead mean (MM) operator, the weighted MM (WMM) operator, the dual MM (DMM) operator, and the weighted DMM (WDMM) operator to define several picture fuzzy aggregation operators: the picture fuzzy MM (PFMM) operator, the picture fuzzy WMM (PFWMM) operator, the picture fuzzy DMM (PFDMM) operator, and the picture fuzzy WDMM (PFWDMM) operator. The key properties of these operators are investigated. Moreover, we use the PFWMM and PFWDMM operators to build a decision-making model for picture fuzzy MADM problems. Finally, we take a concrete instance of appraising a financial investment risk to demonstrate the model and to verify its accuracy and scientific merit.
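As background for the operators named above, the following is a minimal sketch of the classical Muirhead mean and its dual on ordinary non-negative real numbers. It does not implement the picture fuzzy versions, which would apply picture fuzzy operational laws to PFN triples; the function names are hypothetical, and the definitions used are the standard crisp ones: MM^P(a_1..a_n) = ((1/n!) * sum over permutations s of prod_j a_s(j)^p_j)^(1/sum_j p_j), and DMM^P(a_1..a_n) = (1/sum_j p_j) * (prod over permutations s of sum_j p_j * a_s(j))^(1/n!).

```python
from itertools import permutations
from math import factorial, prod


def muirhead_mean(values, p):
    """Crisp Muirhead mean of non-negative `values` with parameter vector `p`."""
    n = len(values)
    if len(p) != n:
        raise ValueError("values and p must have the same length")
    total_p = sum(p)
    if total_p <= 0:
        raise ValueError("sum of p must be positive")
    # Sum, over every ordering of the inputs, of prod_j a_sigma(j) ** p_j.
    acc = sum(
        prod(a ** pj for a, pj in zip(perm, p))
        for perm in permutations(values)
    )
    return (acc / factorial(n)) ** (1.0 / total_p)


def dual_muirhead_mean(values, p):
    """Crisp dual Muirhead mean: sums and products swap roles."""
    n = len(values)
    if len(p) != n:
        raise ValueError("values and p must have the same length")
    total_p = sum(p)
    if total_p <= 0:
        raise ValueError("sum of p must be positive")
    # Product, over every ordering of the inputs, of sum_j p_j * a_sigma(j).
    acc = prod(
        sum(pj * a for a, pj in zip(perm, p))
        for perm in permutations(values)
    )
    return acc ** (1.0 / factorial(n)) / total_p


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0]
    print(muirhead_mean(data, (1, 0, 0)))       # arithmetic mean of data
    print(muirhead_mean(data, (1, 1, 1)))       # geometric mean of data
    print(dual_muirhead_mean(data, (1, 0, 0)))  # geometric mean of data
    print(dual_muirhead_mean(data, (1, 1, 1)))  # arithmetic mean of data
```

The parameter vector controls how many inputs interact in each term: P = (1, 0, ..., 0) ignores interactions, while P = (1, 1, ..., 1) couples all inputs. Enumerating all n! permutations is only practical for a small number of attributes.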