This paper applies deep reinforcement learning (DRL) to the synthetic jet control of flow over a NACA (National Advisory Committee for Aeronautics) 0012 airfoil under weakly turbulent conditions. Based on the proximal policy optimization (PPO) method, an appropriate strategy for controlling the mass flow rate of a synthetic jet is successfully obtained at [Formula: see text]. The effectiveness of the DRL-based active flow control (AFC) method is first demonstrated on the problem with a constant inlet velocity, where a remarkable drag reduction of 27.0% and lift enhancement of 27.7% are achieved, accompanied by the elimination of vortex shedding. The complexity of the problem is then increased by changing the inlet velocity condition and the reward function of the DRL algorithm. In particular, inlet velocities pulsating at two different frequencies, as well as their combination, are applied, making the airfoil wake more difficult to suppress dynamically and precisely, and the reward function is extended to include the goal of saving the energy consumed by the synthetic jets. After training, the DRL agent is still able to find a proper control strategy that achieves significant drag reduction and lift stabilization, and the energy-aware agent reduces the energy consumption of the synthetic jets by 83%. The performance of the DRL-based AFC demonstrates the strong ability of DRL to handle fluid dynamics problems, which typically exhibit high nonlinearity, and encourages further investigations of DRL-based AFC.
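To make the control loop concrete, the following minimal Python sketch shows how a PPO-style agent could interact with a CFD environment that exposes the synthetic-jet mass flow rate as the action and combines drag, lift stability, and jet energy in the reward. The environment class, probe layout, and coefficient values are illustrative assumptions, not the authors' code.

```python
# Hedged sketch of a DRL flow-control rollout, assuming a gym-style environment
# wrapping the CFD solver. All names and numbers below are placeholders.
import numpy as np

class JetControlEnv:
    """Toy stand-in for the CFD environment around the NACA 0012 airfoil."""

    def __init__(self, n_probes=100, steps_per_action=50):
        self.n_probes = n_probes
        self.steps_per_action = steps_per_action
        self.rng = np.random.default_rng(0)

    def reset(self):
        # In the real setup this would restart the solver from a converged baseline flow.
        return self.rng.normal(size=self.n_probes)

    def step(self, mass_flow_rate):
        # Placeholder for advancing the solver `steps_per_action` time steps
        # with the commanded synthetic-jet mass flow rate.
        obs = self.rng.normal(size=self.n_probes)        # pressure-probe readings
        c_d = 0.03 + 0.01 * abs(mass_flow_rate)          # dummy drag coefficient
        c_l = 0.80 + 0.05 * np.tanh(mass_flow_rate)      # dummy lift coefficient
        # Reward favours low drag, stable lift, and low jet energy use,
        # mirroring the objectives described in the abstract.
        reward = -c_d - 0.1 * abs(c_l - 0.8) - 0.05 * mass_flow_rate**2
        return obs, reward, False

env = JetControlEnv()
obs = env.reset()
for _ in range(10):                                          # short illustrative rollout
    action = float(np.clip(np.random.normal(), -1.0, 1.0))   # a PPO policy would act here
    obs, reward, done = env.step(action)
```

In practice the dummy solver step would be replaced by the CFD time integration, and the random action by the output of the trained PPO policy network.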
In the interdisciplinary field of data-driven modeling and computational fluid mechanics, reduced-order models for flow field prediction have in recent years mainly been constructed with convolutional neural networks (CNNs). However, the standard CNN is only applicable to data with a Euclidean spatial structure, while data with non-Euclidean properties can only be convolved after pixelization, which usually degrades accuracy. In this work, a novel data-driven framework based on the graph convolutional network (GCN) is proposed that allows the convolution operator to predict fluid dynamics on non-uniform structured or unstructured mesh data. This is possible because the graph data inherit the spatial characteristics of the mesh and because of the message-passing mechanism of the GCN. The conversion from mesh data to graph data and the operating mechanism of the GCN are clarified. Moreover, additional relevance features and a weighted loss function for the dataset are investigated to improve model performance. The model learns an end-to-end mapping between the spatial features of the mesh and the physical flow field. Studies of various internal-flow cases show that the proposed GCN-based model offers excellent adaptability to non-uniformly distributed mesh data while achieving high accuracy and a three-order-of-magnitude speedup compared with numerical simulation. Our framework generalizes the graph convolutional network to flow field prediction and opens the door to extending GCN to most existing data-driven architectures in fluid dynamics.
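The sketch below illustrates the mesh-to-graph conversion and a single graph-convolution (message-passing) layer in plain numpy. The layer follows the standard Kipf-Welling formulation, H' = sigma(D^-1/2 (A+I) D^-1/2 H W); this formulation, the toy mesh, and the feature choices are assumptions for illustration, not the paper's exact architecture.

```python
# Minimal sketch: build a graph from mesh connectivity, then apply one GCN layer.
import numpy as np

def mesh_to_graph(points, edges):
    """Build a symmetric adjacency matrix from mesh vertices and an edge list."""
    n = len(points)
    adj = np.zeros((n, n))
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1.0
    return adj

def gcn_layer(features, adj, weight):
    """One GCN layer: normalized neighbourhood aggregation, linear map, ReLU."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt      # symmetric normalization
    return np.maximum(0.0, a_norm @ features @ weight)

# Tiny unstructured "mesh": 4 vertices with arbitrary connectivity.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
adj = mesh_to_graph(points, edges)

# Node features: coordinates plus one extra geometric/relevance feature.
features = np.hstack([points, np.ones((4, 1))])
weight = np.random.default_rng(0).normal(size=(3, 8))
hidden = gcn_layer(features, adj, weight)         # per-node latent features
```

Because the adjacency comes directly from the mesh connectivity, no pixelization is needed and the non-uniform node spacing is preserved, which is the key property the abstract highlights.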
Deep reinforcement learning (DRL) has gradually emerged as an effective and novel method for achieving active flow control with outstanding performance. This paper focuses on strategies for improving the learning efficiency and control performance of a new task by using existing control experience. More specifically, the proximal policy optimization (PPO) algorithm is used to control the flow past a circular cylinder with jets. The DRL controllers trained from randomly initialized parameters achieve drag reductions of 8%, 18.7%, 18.4%, and 25.2% at Re = 100, 200, 300, and 1000, respectively, and the cases with higher Reynolds numbers take more episodes to converge because of the increased flow complexity. Furthermore, an agent trained at a high Reynolds number shows satisfactory control performance when applied to lower-Reynolds-number cases, which indicates a strong correlation between the control policies learned for flows under different conditions. To better utilize the control experience of the trained agents, the flow control tasks at Re = 200, 300, and 1000 are retrained starting from the agents trained at Re = 100, 200, and 300, respectively. Our results show a dramatic enhancement of the learning efficiency: the number of training episodes drops to less than 20% of that required by agents trained from random initialization. The strong performance of this transfer-training approach demonstrates its potential for reducing training cost and improving control effectiveness, especially for complex control tasks.
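The transfer-training idea can be summarized in a few lines: instead of starting the higher-Reynolds-number agent from random weights, it is warm-started from the parameters learned at the lower Reynolds number. The sketch below is a hedged illustration; the parameter layout and the train() placeholder are assumptions, not the authors' implementation.

```python
# Hedged sketch of warm-starting a PPO agent across Reynolds numbers.
import copy
import numpy as np

def random_policy_params(layer_sizes=(100, 64, 64, 2), seed=0):
    """Randomly initialized weights for a small policy network."""
    rng = np.random.default_rng(seed)
    return [rng.normal(scale=0.1, size=(m, n))
            for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def train(params, reynolds, episodes):
    """Placeholder for the PPO training loop in the CFD environment."""
    print(f"training at Re={reynolds} for {episodes} episodes")
    return params  # in practice: the weights after PPO optimization

# Baseline: train from scratch at Re=100.
params_re100 = train(random_policy_params(), reynolds=100, episodes=300)

# Transfer: warm-start the Re=200 agent from the Re=100 policy, so that far
# fewer episodes are needed than with random initialization (per the abstract, <20%).
params_re200 = train(copy.deepcopy(params_re100), reynolds=200, episodes=50)
```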
This paper investigates the performance of several of the most popular deep reinforcement learning (DRL) algorithms applied to fluid flow and convective heat transfer systems, providing credible guidance on and evaluation of their characteristics and performance. The algorithms are selected by considering their popularity, category, and advancement to ensure the significance of the study. The effectiveness and feasibility of all DRL algorithms are first demonstrated on a two-dimensional multi-heat-source cooling problem. Compared with the best manually optimized control, all DRL algorithms find better control strategies that achieve a further temperature reduction of 3–7 K. For problems with complex control objectives and environments, PPO (proximal policy optimization) shows outstanding performance, accurately and dynamically constraining the oscillation of the solid temperature to within 0.5 K of the target value, which is far beyond the capability of the manually optimized control. Based on the presented performance and a supplementary generalization test, the characteristics and specialties of the DRL algorithms are analyzed. The value-based methods train more efficiently on simple cooling tasks with linear rewards, while the policy-based methods show remarkable convergence on demanding tasks with nonlinear rewards. Among the algorithms studied, single-step PPO and prioritized-experience-replay deep Q-networks deserve particular attention: the former can consider multiple control targets, and the latter obtains the best results in all generalization tests. In addition, randomly resetting the environment is confirmed to be indispensable for a trained agent executing long-term control and is strongly recommended for follow-up studies.
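The distinction between linear and nonlinear rewards can be illustrated with two toy reward functions for the temperature-control task: a linear reward that simply favors cooling, and a band-shaped nonlinear reward that keeps the solid temperature within a tolerance of the target. The functional forms and numerical values below are assumptions for illustration, not taken from the paper.

```python
# Illustrative reward shapes for the cooling task (assumed, not the paper's).
import numpy as np

T_TARGET = 320.0   # target solid temperature [K], illustrative value
BAND = 0.5         # allowed oscillation band [K]

def linear_reward(temperature):
    """Rewards any temperature reduction; suits simple cooling tasks."""
    return -(temperature - T_TARGET)

def band_reward(temperature):
    """Nonlinear reward: flat inside the band, sharply penalized outside."""
    deviation = abs(temperature - T_TARGET)
    return 1.0 if deviation <= BAND else -float(np.square(deviation - BAND))

for temp in (318.0, 319.8, 320.3, 322.0):
    print(temp, linear_reward(temp), band_reward(temp))
```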