In indoor robot global path planning, the robot must move from a starting point to a target point specified by a position command received over a wireless network. Existing global path planning methods based on behavior dynamics and rolling windows have limitations in practice: the resulting path may not be optimal; a pseudo-attractor or blind search can arise in environments with large state spaces; offline learning cannot adapt to real-time environmental changes; and the probability of selecting each robot action must be set manually. To address these problems, we propose a path planning approach that combines behavior dynamics and rolling windows with online reinforcement learning. Q-learning optimizes the parameters of the behavior dynamics model to improve performance, while behavior dynamics guides the Q-learning process and improves learning efficiency; in each round of reinforcement learning, the action selection knowledge is gradually corrected as the Q-table is updated, optimizing the learning process. Simulation results show that this method achieves a marked improvement in path planning. In a physical experiment, the robot obtains the target location over the wireless network and plans an optimized, smooth global path online.
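The core mechanism the abstract describes, a Q-table that is updated each round so that action selection knowledge is gradually corrected, can be illustrated with a minimal sketch. The grid size, reward shaping, and epsilon-greedy parameters below are illustrative assumptions, not the paper's actual model; the behavior dynamics and rolling window components are omitted and only the tabular Q-learning update is shown:

```python
import random

def q_learning_grid(size=5, start=(0, 0), goal=(4, 4),
                    episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning on a toy grid world (illustrative only)."""
    rng = random.Random(seed)
    actions = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
    Q = {}  # maps (state, action_index) -> estimated value

    def step(state, move):
        # Clamp the move to the grid boundary.
        nr = min(max(state[0] + move[0], 0), size - 1)
        nc = min(max(state[1] + move[1], 0), size - 1)
        nxt = (nr, nc)
        # Small step cost encourages shorter paths; goal gives positive reward.
        reward = 1.0 if nxt == goal else -0.01
        return nxt, reward, nxt == goal

    for _ in range(episodes):
        s, done = start, False
        while not done:
            # Epsilon-greedy selection: the Q-table gradually corrects
            # which action looks best from each state.
            if rng.random() < eps:
                ai = rng.randrange(len(actions))
            else:
                ai = max(range(len(actions)),
                         key=lambda i: Q.get((s, i), 0.0))
            ns, r, done = step(s, actions[ai])
            best_next = max(Q.get((ns, i), 0.0) for i in range(len(actions)))
            old = Q.get((s, ai), 0.0)
            # Standard Q-learning update rule.
            Q[(s, ai)] = old + alpha * (r + gamma * best_next - old)
            s = ns
    return Q

def greedy_path(Q, size=5, start=(0, 0), goal=(4, 4), limit=50):
    """Follow the greedy policy encoded in the learned Q-table."""
    actions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    path, s = [start], start
    while s != goal and len(path) < limit:
        ai = max(range(len(actions)), key=lambda i: Q.get((s, i), 0.0))
        s = (min(max(s[0] + actions[ai][0], 0), size - 1),
             min(max(s[1] + actions[ai][1], 0), size - 1))
        path.append(s)
    return path
```

In the proposed method, such an update would not run blindly: the behavior dynamics model would bias action selection so that exploration stays efficient, while the rolling window restricts planning to the locally observed region.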