Deep reinforcement learning (RL) has achieved outstanding results in recent years, leading to a dramatic increase in the number of applications and methods. Recent works have explored learning beyond single-agent scenarios and have considered multiagent learning (MAL) scenarios. Initial results report successes in complex multiagent domains, although several challenges remain to be addressed. The primary goal of this article is to provide a clear overview of the current multiagent deep reinforcement learning (MDRL) literature. Additionally, we complement the overview with a broader analysis: (i) we revisit previous key components, originally presented in MAL and RL, and highlight how they have been adapted to multiagent deep reinforcement learning settings; (ii) we provide general guidelines to new practitioners in the area, describing lessons learned from MDRL works, pointing to recent benchmarks, and outlining open avenues of research; (iii) we take a more critical tone, raising practical challenges of MDRL (e.g., implementation and computational demands). We expect this article will help unify and motivate future research to take advantage of the abundant literature that exists (e.g., RL and MAL) in a joint effort to promote fruitful research in the multiagent community.

Note: earlier versions of this work had the title "Is multiagent deep reinforcement learning the answer or the question? A brief survey" (arXiv:1810.05587v3 [cs.MA], 30 Aug 2019).

Notable recent successes include Go [14,15], poker [16,17], and games of two competing teams, e.g., DOTA 2 [18] and StarCraft II [19]. While different techniques and algorithms were used in these scenarios, in general they are all a combination of techniques from two main areas: reinforcement learning (RL) [20] and deep learning [21,22]. RL is an area of machine learning in which an agent learns by interacting (i.e., taking actions) within a dynamic environment. However, one of the main challenges in RL, and in traditional machine learning in general, is the need to manually design quality features on which to learn. Deep learning enables efficient representation learning, thus allowing the automatic discovery of features [21,22]. In recent years, deep learning has had successes in different areas such as computer vision and natural language processing [21,22]. One of the key aspects of deep learning is the use of neural networks (NNs) that can find compact representations in high-dimensional data [23].

In deep reinforcement learning (DRL) [23,24], deep neural networks are trained to approximate the optimal policy and/or the value function. In this way the deep NN, serving as a function approximator, enables powerful generalization. One of the key advantages of DRL is that it enables RL to scale to problems with high-dimensional state and action spaces. However, most existing successful DRL applications so far have been in visual domains (e.g., Atari games), and there is still a lot of work to be done for more realistic applications [25,26] with complex dynamics, which are not necessarily vision-based. DRL h...
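To make concrete the idea of a deep network approximating a value function, here is a minimal sketch in the spirit of DQN-style methods. Everything in it (network sizes, hyperparameters, environment dimensions) is an illustrative assumption, not taken from the survey.

```python
# Minimal sketch (not from the survey): a neural network approximating
# Q-values, updated toward a one-step temporal-difference target.
# All names and sizes here are illustrative assumptions.
import torch
import torch.nn as nn

obs_dim, n_actions = 8, 4          # hypothetical environment sizes

q_net = nn.Sequential(             # the deep function approximator
    nn.Linear(obs_dim, 64), nn.ReLU(),
    nn.Linear(64, n_actions),
)
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma = 0.99                       # discount factor

def td_update(s, a, r, s_next, done):
    """One gradient step toward the TD target r + gamma * max_a' Q(s', a')."""
    q_sa = q_net(s)[a]
    with torch.no_grad():
        target = r + gamma * q_net(s_next).max() * (1.0 - done)
    loss = (q_sa - target) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Example transition (random placeholder data)
td_update(torch.randn(obs_dim), 2, 1.0, torch.randn(obs_dim), 0.0)
```

A full DQN additionally uses a replay buffer and a target network; they are omitted here to keep the sketch minimal.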
Triggered by the increased fluctuations of renewable energy sources, the European Commission has stated the need for integrated short-term energy markets (e.g., intraday) and recognized the facilitating role that local energy communities could play. In particular, microgrids and energy communities are expected to play a crucial part in guaranteeing the balance between generation and consumption at the local level. Local energy markets empower small players and provide a stepping stone towards fully transactive energy systems. In this paper we evaluate such a fully integrated transactive system by (1) modelling the energy resource management problem of a microgrid under uncertainty, considering flexible loads and market participation (solved via two-stage stochastic programming; a toy sketch of this formulation follows the notation below); (2) modelling a wholesale market and a local market; and (3) coupling these elements into an integrated transactive energy simulation. Results under a realistic case study (with varying prices and competitiveness of local markets) show the effectiveness of the transactive system, with a reduction of up to 75% in expected costs when local markets and flexibility are considered. This illustrates how local markets can facilitate the trade of energy, thereby increasing the tolerable penetration of renewable resources and facilitating the energy transition.

Index Terms: demand response, local electricity markets, microgrids, transactive energy, smart grids, stochastic optimization.

NOTATION

Indices:
- e: energy storage systems (ESSs)
- i: distributed generation (DG) units
- l, m, s, t, v: loads, markets, scenarios, periods, electric vehicles (EVs)

Sets and subsets:
- Ω_DG, Ω_load: sets of DG units / loads
- Ω_DG^d, Ω_DG^nd: subsets of dispatchable / non-dispatchable DG units
- Ω_load^curt, Ω_load^inte: subsets of curtailable / interruptible loads
- Ω_load^shift: subset of shiftable loads

Parameters:
- C_DG: generation cost of a DG unit (m.u./kWh)
- C_ESS-, C_EV-: discharging cost of ESS / EV (m.u./kWh)
- C_curt, C_inte, C_shift: load curtailment / interruption / shift cost (m.u./kWh)
- C_imb: grid imbalance cost (m.u./kWh)
- MP: electricity market price (m.u./kWh)
- N_e, N_i, N_l: number of ESSs / DG units / loads
- N_m, N_s, N_v: number of markets / scenarios / EVs
- P_curt^max: maximum load reduction of Ω_load^curt (kW)
- P_DG^max, P_DG^min: maximum / minimum power of dispatchable DG units (kW)
- P_DG^nd: forecast power of non-dispatchable DG units (kW)
- P_ESS+^max, P_EV+^max: maximum charge rate of ESSs / EVs (kW)
- P_ESS-^max, P_EV-^max: maximum discharge rate of ESSs / EVs (kW)
- P_ESS^max, P_ESS^min: maximum / minimum energy capacity of ESSs (kWh)
- P_EV^max, P_EV^min: maximum / minimum energy capacity of EVs (kWh)
- P_EV^trip: forecasted energy demand for EVs' trips (kWh)
- P_load: forecasted active power of loads (kW)
- P_offer^max, P_offer^min: maximum / minimum energy offer in markets (kW)
- P_shift: forecasted power of Ω_load^shift in T_shift (kW)
- P_shift^max: maximum load shifted of Ω_load^shift in T_shift (kW)
- T: number of periods
- T_shift: shift interval of Ω_load^shift
- T_shift^start, T_shift^end: earliest / latest possible period for load shift of Ω_load^shift
- η_EV+, η_EV-: charging / discharging efficiency of EVs ...
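The two-stage stochastic programming formulation mentioned above can be illustrated with a deliberately small sketch: a first-stage (here-and-now) energy purchase at a known price, and per-scenario recourse purchases at a higher imbalance price. This toy model and all its numbers are assumptions for illustration; the paper's full model additionally covers DG units, ESSs, EVs, and flexible loads.

```python
# Minimal sketch (illustrative, not the paper's model): a scenario-based
# two-stage stochastic LP. First stage: energy x bought ahead at a known
# price; second stage: recourse energy y_s bought per scenario at a higher
# imbalance price. All numbers below are made-up assumptions.
import numpy as np
from scipy.optimize import linprog

price, imb_price = 0.10, 0.25            # m.u./kWh, hypothetical
demand = np.array([90.0, 100.0, 120.0])  # demand per scenario (kWh)
prob = np.array([0.3, 0.5, 0.2])         # scenario probabilities

# Decision vector: [x, y_1, y_2, y_3]
c = np.concatenate(([price], imb_price * prob))  # expected total cost

# Coverage constraints: x + y_s >= demand_s  <=>  -x - y_s <= -demand_s
A_ub = np.hstack((-np.ones((3, 1)), -np.eye(3)))
b_ub = -demand

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4)
x, y = res.x[0], res.x[1:]
print(f"first-stage purchase: {x:.1f} kWh, recourse per scenario: {y}")
```

The optimum hedges by buying ahead up to the demand level at which the expected imbalance saving no longer exceeds the day-ahead price (here x = 100 kWh, with 20 kWh of recourse in the high-demand scenario).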
In contrast to high-precision but highly obtrusive sensors, using smartphone sensors to measure daily behaviours allowed us to quantify behaviour changes relevant to occupational stress. Furthermore, we have shown that using transfer learning to select data from closely related models is a useful approach for improving accuracy in the presence of scarce data.
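The abstract does not spell out the selection mechanism, so the following is only a generic sketch of one plausible instantiation: score each pre-trained source model on a small labelled target sample, pick the closest one, and fine-tune it on the scarce target data.

```python
# Generic sketch (assumed, not the paper's exact procedure): pick the
# "closest" source model by its accuracy on a small labelled target sample,
# then continue training it on the scarce target data.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# Hypothetical pre-trained source models (e.g., one per previous user)
sources = [
    SGDClassifier(loss="log_loss").fit(
        rng.normal(size=(200, 5)), rng.integers(0, 2, 200)
    )
    for _ in range(3)
]

# Scarce labelled target data (e.g., a new user's first days)
X_t, y_t = rng.normal(size=(20, 5)), rng.integers(0, 2, 20)

# Select the source model that already fits the target best ("closest" model)
closest = max(sources, key=lambda m: m.score(X_t, y_t))

# Fine-tune the selected model on the target data instead of starting from scratch
for _ in range(5):
    closest.partial_fit(X_t, y_t, classes=np.array([0, 1]))
```

Only the selection criterion matters for the sketch; any similarity measure between source and target data could replace the validation score used here.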
Although Reinforcement Learning (RL) has been one of the most successful approaches for learning in sequential decision-making problems, the sample complexity of RL techniques still represents a major challenge for practical applications. To combat this challenge, whenever a competent policy (e.g., either a legacy system or a human demonstrator) is available, the agent can leverage samples from this policy (advice) to improve sample efficiency. However, advice is normally limited, so it should ideally be directed to states where the agent is uncertain about the best action to execute. In this work, we propose Requesting Confidence-Moderated Policy advice (RCMP), an action-advising framework in which the agent asks for advice when its epistemic uncertainty is high for a certain state. RCMP takes into account that the advice is limited and might be suboptimal. We also describe a technique to estimate the agent's uncertainty by making minor modifications to standard value-function-based RL methods. Our empirical evaluations show that RCMP performs better than Importance Advising, receiving no advice, and receiving advice at random states in Gridworld and Atari Pong scenarios.
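As one way to picture the uncertainty-triggered advice request, the sketch below estimates epistemic uncertainty from the disagreement (variance) across multiple Q-value heads sharing a network torso; this particular estimator, the threshold, and all sizes are illustrative assumptions rather than the paper's exact design.

```python
# Minimal sketch of uncertainty-triggered advice requests, assuming (as one
# common instantiation) that epistemic uncertainty is estimated from the
# variance across multiple Q-value heads sharing a torso. The threshold and
# all sizes are illustrative, not the paper's settings.
import torch
import torch.nn as nn

obs_dim, n_actions, n_heads = 8, 4, 5

torso = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
heads = nn.ModuleList([nn.Linear(64, n_actions) for _ in range(n_heads)])

def act(state, advice_budget, teacher_policy, threshold=0.5):
    """Ask the teacher when the heads disagree; otherwise act greedily."""
    with torch.no_grad():
        z = torso(state)
        q = torch.stack([h(z) for h in heads])       # (n_heads, n_actions)
    uncertainty = q.var(dim=0).mean().item()         # spread across heads
    if advice_budget > 0 and uncertainty > threshold:
        return teacher_policy(state), advice_budget - 1  # spend one advice unit
    return q.mean(dim=0).argmax().item(), advice_budget  # own greedy action

# Example: a hypothetical teacher that always suggests action 0
action, budget = act(torch.randn(obs_dim), advice_budget=10,
                     teacher_policy=lambda s: 0)
```

When the heads agree, their variance is low and the agent's own greedy action is used; the limited advice budget is spent only where the value estimate is genuinely uncertain.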