In power wireless communication networks, traditional network selection algorithms cannot select a network dynamically according to current network conditions, so this paper proposes a network selection algorithm based on Q-learning. Q-learning is a model-free reinforcement learning algorithm that is often applied in wireless communication because of its flexibility and adaptability; it obtains the optimal policy by estimating the Q-value function of state-action pairs. In this paper, we establish a Q-learning network selection model, determine the network state from an analysis of network load and service type, and then choose the action with the maximum cumulative reward, which corresponds to the optimal network. Because the Q-values are updated iteratively, the algorithm can adapt to dynamic network selection. Simulation results show that the Q-learning-based network selection algorithm effectively reduces the voice traffic blocking rate and the data packet loss rate, and improves the average network throughput.
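
The selection loop described above can be illustrated with a minimal tabular Q-learning sketch. It is not the paper's implementation: the load levels, the two candidate networks, and the values of the learning rate, discount factor, and exploration probability are all assumptions made for illustration. Only the general scheme comes from the text: a state built from network load and service type, an action that picks a network, and an iteratively updated Q-value.

```python
import random
from collections import defaultdict

# Hypothetical discretisation: the abstract only says the state is derived
# from network load and service type; these concrete levels are assumed.
LOAD_LEVELS = ("low", "medium", "high")
SERVICES = ("voice", "data")
NETWORKS = (0, 1)  # indices of candidate networks (assumed: two networks)

ALPHA = 0.1    # learning rate (assumed value)
GAMMA = 0.9    # discount factor (assumed value)
EPSILON = 0.1  # exploration probability (assumed value)


def make_q_table():
    """Q-table mapping (state, action) to a value, defaulting to 0.0."""
    return defaultdict(float)


def select_network(q, state, rng, epsilon=EPSILON):
    """Epsilon-greedy choice: explore with probability epsilon,
    otherwise pick the network with the largest Q-value in this state."""
    if rng.random() < epsilon:
        return rng.choice(NETWORKS)
    return max(NETWORKS, key=lambda a: q[(state, a)])


def update(q, state, action, reward, next_state, alpha=ALPHA, gamma=GAMMA):
    """Standard Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in NETWORKS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])


if __name__ == "__main__":
    rng = random.Random(0)
    q = make_q_table()
    state = ("low", "voice")
    # Toy reward (assumed): network 0 always serves the request, network 1 never does.
    for _ in range(200):
        action = select_network(q, state, rng)
        reward = 1.0 if action == 0 else 0.0
        update(q, state, action, reward, state)
    print(select_network(q, state, rng, epsilon=0.0))
```

In a real system the reward would come from measured outcomes such as whether a voice call was blocked or a data packet was lost; the constant toy reward here only exercises the update rule.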