Ocean waves are among the most significant renewable energy sources and an excellent resource for tackling climate challenges by decarbonizing energy generation. Lowering the Levelized Cost of Energy (LCOE) is critical if wave power is to compete with other forms of clean energy such as wind and solar. Doing so requires complex controllers that maximize the efficiency of state-of-the-art multi-generator industrial Wave Energy Converters (WECs) by optimizing the reactive forces of the generators on the WEC's multiple legs. This paper introduces Multi-Agent Reinforcement Learning (MARL) controller architectures that address these LCOE objectives: MARL can increase energy capture efficiency to boost revenue, reduce structural stress to limit maintenance cost, and adaptively and proactively protect the converter from catastrophic weather events, preserving investment and lowering effective capital cost. The architectures include two-agent and three-agent MARL variants implementing Proximal Policy Optimization (PPO), with optimizations that sustain training convergence on the complex optimization surface without falling off performance cliffs. A design-for-trust approach additionally keeps the WEC operating within a safe zone of mechanical compliance. As part of this design, reward shaping combines an energy-capture objective with penalties for harmful motions, minimizing stress and lowering maintenance cost. The proposed MARL controllers achieved double-digit gains in energy capture efficiency over the baseline spring-damper controller across waves of different principal frequencies.
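The abstract does not give the exact reward formulation. As a hypothetical illustration of the multi-objective reward shaping it describes, the sketch below combines an energy-capture term with a penalty for harmful yaw motion and a soft barrier on stroke excursions; all weights and signal names (`W_ENERGY`, `yaw_rate`, `STROKE_LIMIT`, and so on) are illustrative assumptions, not values from the paper:

```python
# Hypothetical multi-objective reward shaping for one WEC leg agent.
# Weights and signal names are illustrative assumptions, not the paper's values.
W_ENERGY = 1.0      # reward per unit of captured power
W_STRESS = 0.3      # penalty weight for harmful (e.g., yaw) motion
W_LIMIT = 10.0      # strong penalty for leaving the mechanical-compliance zone
STROKE_LIMIT = 1.0  # normalized safe stroke range

def shaped_reward(captured_power, yaw_rate, stroke_position):
    """Combine energy capture with structural-stress penalties."""
    energy_term = W_ENERGY * captured_power
    # Penalize harmful rotational motion quadratically.
    stress_term = W_STRESS * yaw_rate ** 2
    # Soft barrier: heavily penalize excursions past the safe stroke range.
    overshoot = max(0.0, abs(stroke_position) - STROKE_LIMIT)
    limit_term = W_LIMIT * overshoot ** 2
    return energy_term - stress_term - limit_term
```

In a two- or three-agent configuration, each PPO agent controlling a subset of the WEC's legs would receive such a shaped reward, with the penalty terms steering the policy away from mechanically non-compliant regions during training.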
Industrial multi-generator Wave Energy Converters (WECs) must handle multiple simultaneous waves coming from different directions, called spread waves. These complex devices, operating in challenging conditions, need controllers that balance multiple objectives: energy capture efficiency, reduction of structural stress to limit maintenance, and proactive protection against high waves. A Multi-Agent Reinforcement Learning (MARL) controller trained with the Proximal Policy Optimization (PPO) algorithm can handle these complexities. In this paper, we explore different function approximators for the policy and critic networks, which model the sequential nature of the system dynamics, and find that the choice of approximator is key to performance. We investigated fully connected neural network (FCN), LSTM, and Transformer variants with varying depths and gated residual connections. Our results show that the moderate-depth transformer with gated residual connections around the multi-head attention, the multi-layer perceptron, and the transformer block (STrXL), proposed in this paper, is optimal, boosting energy efficiency by an average of 22.1% on these complex spread waves over the existing spring-damper (SD) controller. Furthermore, unlike the default SD controller, the transformer controller almost eliminated the mechanical stress from rotational yaw motion in angled waves. Demo: https://tinyurl.com/yueda3jh
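The abstract names STrXL but does not spell out its architecture. The following PyTorch sketch is one plausible reading, assuming GTrXL-style GRU gating applied at the three places the abstract mentions: around the multi-head attention, around the MLP, and around the transformer block as a whole (dimensions, gate formulation, and layer ordering are assumptions):

```python
import torch
import torch.nn as nn

class GRUGate(nn.Module):
    """GRU-style gate blending residual stream x with sublayer output y.
    A positive gate_bias keeps the update gate near zero at initialization,
    so each gate starts close to an identity mapping over x."""
    def __init__(self, d_model, gate_bias=2.0):
        super().__init__()
        self.wr = nn.Linear(2 * d_model, d_model)  # reset gate
        self.wz = nn.Linear(2 * d_model, d_model)  # update gate
        self.wh = nn.Linear(2 * d_model, d_model)  # candidate state
        self.gate_bias = gate_bias

    def forward(self, x, y):
        xy = torch.cat([x, y], dim=-1)
        r = torch.sigmoid(self.wr(xy))
        z = torch.sigmoid(self.wz(xy) - self.gate_bias)
        h = torch.tanh(self.wh(torch.cat([r * x, y], dim=-1)))
        return (1 - z) * x + z * h

class GatedTransformerBlock(nn.Module):
    """Transformer block with gated residuals around the attention, the MLP,
    and the block itself -- one plausible reading of STrXL."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate_attn = GRUGate(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.ReLU(),
            nn.Linear(4 * d_model, d_model))
        self.gate_mlp = GRUGate(d_model)
        self.gate_block = GRUGate(d_model)  # gate around the whole block

    def forward(self, x):
        h = self.norm1(x)
        a, _ = self.attn(h, h, h)
        x1 = self.gate_attn(x, a)                          # gated attention residual
        x2 = self.gate_mlp(x1, self.mlp(self.norm2(x1)))   # gated MLP residual
        return self.gate_block(x, x2)                      # gated block-level residual
```

Here `x` is a batch of observation sequences embedded to `d_model`. The usual motivation for such gating in RL, which this sketch follows, is that each gate can initially behave close to an identity mapping, stabilizing early PPO training before the attention and MLP sublayers take over.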