Deep reinforcement learning based synthetic jet control on disturbed flow over airfoil

Wang, Yi-Zhe; Mei, Yufei; Aubry, Nadine; Chen, Zhihua; Wu, Peng; Wu, Wei‐Tao

doi:10.1063/5.0080922

Cited by 44 publications

(10 citation statements)

References 33 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Proximal policy optimisation was used by to control a low-Reynolds-number cylinder wake using surface-mounted jets. Wang et al (2022) also resorted to PPO to control a low Reynolds confined bidimensional airfoil flow using three suction-side synthetic jets to reduce drag. In this paper, a variant named proximal policy optimisation with covariance matrix adaptation (PPO-CMA) (Hämäläinen et al 2020) is used as the base learning algorithm.…”

Section: Introductionmentioning

confidence: 99%

See 1 more Smart Citation

Reinforcement-learning-based actuator selection method for active flow control

2023

View full text Add to dashboard Cite

This paper addresses the issue of actuator selection for active flow control by proposing a novel method built on top of a reinforcement learning agent. Starting from a pre-trained agent using numerous actuators, the algorithm estimates the impact of a potential actuator removal on the value function, indicating the agent's performance. It is applied to two test cases, the one-dimensional Kuramoto–Sivashinsky equation and a laminar bidimensional flow around an airfoil at $Re=1000$ for different angles of attack ranging from $12^{\circ }$ to $20^{\circ }$ , to demonstrate its capabilities and limits. The proposed actuator-sparsification method relies on a sequential elimination of the least relevant action components, starting from a fully developed layout. The relevancy of each component is evaluated using metrics based on the value function. Results show that, while still being limited by this intrinsic elimination paradigm (i.e. the sequential elimination), actuator patterns and obtained policies demonstrate relevant performances and allow us to draw an accurate approximation of the Pareto front of performances versus actuator budget.

show abstract

Section: Introductionmentioning

confidence: 99%

“…Wang et al. (2022) also resorted to PPO to control a low Reynolds confined bidimensional airfoil flow using three suction-side synthetic jets to reduce drag. In this paper, a variant named proximal policy optimisation with covariance matrix adaptation (PPO-CMA) (Hämäläinen et al.…”

Section: Introductionmentioning

confidence: 99%

Reinforcement-learning-based actuator selection method for active flow control

2023

View full text Add to dashboard Cite

show abstract

“…In recent years, RL has been applied in fluids simulations to reduce the drag experienced in flow around a cylinder [23,24,25], to optimize jets on an airfoil [26], and to find efficient swimming strategies [27]. RL has even been applied to experimental flow systems [28].…”

Section: Introductionmentioning

confidence: 99%

Turbulence control in plane Couette flow using low-dimensional neural ODE-based models and deep reinforcement learning

Linot¹,

Zeng²,

Graham³

2023

Preprint

View full text Add to dashboard Cite

The high dimensionality and complex dynamics of turbulent flows remain an obstacle to the discovery and implementation of control strategies. Deep reinforcement learning (RL) is a promising avenue for overcoming these obstacles, but requires a training phase in which the RL agent iteratively interacts with the flow environment to learn a control policy, which can be prohibitively expensive when the environment involves slow experiments or large-scale simulations. We overcome this challenge using a framework we call "DManD-RL" (datadriven manifold dynamics-RL), which generates a data-driven low-dimensional model of our system that we use for RL training. With this approach, we seek to minimize drag in a direct numerical simulation (DNS) of a turbulent minimal flow unit of plane Couette flow at Re = 400 using two slot jets on one wall. We obtain, from DNS data with O(10 5 ) degrees of freedom, a 25-dimensional DManD model of the dynamics by combining an autoencoder and neural ordinary differential equation. Using this model as the environment, we train an RL control agent, yielding a 440-fold speedup over training on the DNS, with equivalent control performance. The agent learns a policy that laminarizes 84% of unseen DNS test trajectories within 900 time units, significantly outperforming classical opposition control (58%), despite the actuation authority being much more restricted. The agent often achieves laminarization through a counterintuitive strategy that drives the formation of two low-speed streaks, with a spanwise wavelength that is too small to be self-sustaining. The agent demonstrates the same performance when we limit observations to wall shear rate.

show abstract

“…In contrast, RL does not need the 'correct' strategy as supervisory information but generates its own data by exploring and evaluating actions against a reward function 34 . Benefiting from its capacity of modelling policies and value functions in complex RL tasks with continuous state and action space, deep reinforcement learning (DRL) which combines RL and deep learning has been applied to automatically perform AFC strategies [35][36][37][38] . In DRL, an RL agent samples action-state pairs through interacting with an environment and adopts ANNs as function approximators to estimate a value function or a policy from the sampled histories.…”

Section: Introductionmentioning

confidence: 99%

Active flow control using deep reinforcement learning with time delays in Markov decision process and autoregressive policy

2022

View full text Add to dashboard Cite

Classical active flow control (AFC) methods based on solving the Navier-Stocks equations are laborious and computationally intensive even with the use of reduced-order models. Data-driven methods offer a promising alternative for AFC and have been applied successfully to reduce the drag of two-dimensional bluff bodies using deep reinforcement learning (DRL) paradigms. However, the standard DRL method tends to result in large fluctuations in the unsteady forces acting on the cylinder as the Reynolds number increases. In this study, a Markov decision process (MDP) with time delays is introduced to quantify the action delays in the DRL process along with the use of a first-order autoregressive policy (ARP). This hybrid DRL method is applied to control the vortex shedding process from a two-dimensional circular cylinder with four synthetic jet actuators at a freestream Reynolds number of 400. Compared to the standard DRL method, this method is shown to reduce the magnitude of drag and lift fluctuations by approximately 90% at the same actuation frequency while achieving a similar level of drag reduction. Although both methods suppress the vortex-induced forces by extending the recirculation zone behind the circular cylinder, the hybrid method leads to a steadier and more elongated recirculation zone, and hence a weaker vortex shedding process. This study demonstrates the necessity of accurately quantifying the delay in the MDP and the benefits of introducing ARPs to achieve a smooth control of unsteady forces in AFC.

show abstract

Deep reinforcement learning based synthetic jet control on disturbed flow over airfoil

Cited by 44 publications

References 33 publications

Reinforcement-learning-based actuator selection method for active flow control

Reinforcement-learning-based actuator selection method for active flow control

Turbulence control in plane Couette flow using low-dimensional neural ODE-based models and deep reinforcement learning

Active flow control using deep reinforcement learning with time delays in Markov decision process and autoregressive policy

Contact Info

Product

Resources

About