Modeling collective behavior helps reveal the mechanisms that govern collective animal behaviors. Traditional rule-based modeling methods rely heavily on human prior knowledge and may not properly explain how collective behaviors arise. This paper proposes a Deep Q-Networks (DQN)-based modeling method for fish schools. First, an individual's state, a continuous value, is expressed as the angle between its direction and the average direction of its perceived neighbors, and its action is represented as a discretized turning angle. Second, the reward function is constructed from the change in the number of neighbors. Finally, a neural network is constructed to represent the Q-value function and is trained with the DQN algorithm. The proposed approach is tested in two scenarios: single-learner and multi-learner. Results show that in both scenarios the method gradually converges and yields a model that produces collective behavior. We further analyze the learned model in terms of the average order parameter and the collective behavior patterns it generates, and verify that the resulting behavior pattern is highly ordered. In addition, we compare the proposed approach with the Q-Learning algorithm. The results show that our approach has a stronger ability to express the policy, handles continuous states better, and learns more stably during training.
INDEX TERMS Collective behavior, collective behavior model, Deep Q-Networks (DQN), fish school
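To make the state, action, and reward described in the abstract concrete, the following is a minimal Python sketch under stated assumptions. It is not the paper's implementation. The perception radius, the set of five turning angles, the sign convention of the state, the zero state for an individual with no neighbors, and the polarization form of the order parameter are all illustrative assumptions, and every function name is hypothetical.

```python
import math

# Assumed discretization of the turning angle (radians); the abstract does
# not state the actual action set.
ACTIONS = [math.radians(d) for d in (-30, -15, 0, 15, 30)]


def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi


def mean_heading(headings):
    """Circular mean of a list of headings (radians)."""
    s = sum(math.sin(h) for h in headings)
    c = sum(math.cos(h) for h in headings)
    return math.atan2(s, c)


def neighbors(i, positions, radius):
    """Indices of individuals within an assumed perception radius of i."""
    xi, yi = positions[i]
    return [
        j for j, (x, y) in enumerate(positions)
        if j != i and math.hypot(x - xi, y - yi) <= radius
    ]


def state(i, positions, headings, radius):
    """Signed angle from individual i's heading to its neighbors' mean heading.

    Returns 0.0 when i perceives no neighbors (an assumption of this sketch).
    """
    nbrs = neighbors(i, positions, radius)
    if not nbrs:
        return 0.0
    return wrap_angle(mean_heading([headings[j] for j in nbrs]) - headings[i])


def reward(n_before, n_after):
    """Reward as the change in the number of perceived neighbors."""
    return float(n_after - n_before)


def order_parameter(headings):
    """Polarization: length of the mean unit heading vector, in [0, 1]."""
    c = sum(math.cos(h) for h in headings)
    s = sum(math.sin(h) for h in headings)
    return math.hypot(c, s) / len(headings)
```

In a DQN setup, the value returned by `state` would be the network input and the network would output one Q-value per entry in `ACTIONS`. An order parameter near 1 indicates that nearly all individuals share one heading.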