The aerodynamic design of modern civil aircraft demands genuine intelligence, since it requires both a good understanding of transonic aerodynamics and sufficient design experience. Reinforcement learning is a general-purpose artificial intelligence approach that can learn sophisticated skills by trial and error, rather than simply extracting features or making predictions from data. The present paper uses a deep reinforcement learning algorithm to learn a policy for reducing the aerodynamic drag of supercritical airfoils. The policy takes actions based on features of the wall Mach number distribution so that the learned policy generalizes more readily. The initial policy for reinforcement learning is pretrained through imitation learning, and the result is compared with randomly generated initial policies. The policy is then trained in environments based on surrogate models, where reinforcement learning effectively improves the mean drag reduction over 200 airfoils. The policy is also tested on multiple airfoils in different flow conditions using computational fluid dynamics calculations. The results show that the policy is effective in both the training condition and other similar conditions, and that it can be applied repeatedly to achieve greater drag reduction.