This paper proposes a song accompaniment generation method that combines audio analysis with symbolic music generation, allowing human music theory to be incorporated into a reinforcement learning framework that trains an agent to compose music. The core of the algorithm lies in extracting music-theoretic concepts from audio and in building a reward model that works well under reinforcement learning. However, some music theory rules are complex and difficult to formalize, so hardcoding the reward alone is unlikely to yield competitive results. Therefore, to build an effective reward model, a neural network evaluates the perceptual aspects of composition quality, while programmatic discrimination models the music theory rules that are easy to describe; the two components work in tandem. Experiments show that the proposed algorithm generates accompaniment arrangements close to those of human composers, is compatible with a variety of musical styles, and outperforms the baseline algorithms on multiple evaluation metrics.
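The hybrid reward design described above can be illustrated with a minimal sketch. All names, weights, and the blending coefficient below are hypothetical, not from the paper: a toy logistic scorer stands in for the learned perceptual critic, a scale-membership check stands in for a programmatic music theory rule, and the two scores are blended into one reward.

```python
import math

PITCH_CLASSES = 12
MAJOR_SCALE = {0, 2, 4, 5, 7, 9, 11}  # C-major pitch classes

def pitch_histogram(notes):
    """Normalized pitch-class histogram of a list of MIDI note numbers."""
    hist = [0.0] * PITCH_CLASSES
    for n in notes:
        hist[n % PITCH_CLASSES] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def perceptual_score(hist, weights, bias=0.0):
    """Toy stand-in for the learned critic: logistic regression over the
    histogram. The actual system uses a trained neural network."""
    z = sum(w * h for w, h in zip(weights, hist)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # score in [0, 1]

def rule_score(notes, scale=MAJOR_SCALE):
    """Easy-to-describe rule: fraction of notes that stay in the scale."""
    if not notes:
        return 0.0
    return sum(n % PITCH_CLASSES in scale for n in notes) / len(notes)

def hybrid_reward(notes, weights, alpha=0.5):
    """Blend the learned perceptual score with the programmatic rule score."""
    p = perceptual_score(pitch_histogram(notes), weights)
    return alpha * p + (1.0 - alpha) * rule_score(notes)

# Example: a C-major triad fully satisfies the scale rule.
weights = [0.5] * PITCH_CLASSES
print(round(rule_score([60, 64, 67]), 2))  # → 1.0
```

In this sketch `alpha` controls how much the agent's reward leans on learned perceptual judgment versus hard music theory rules; the paper's actual weighting scheme is not specified in the abstract.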