This work proposes a closed-loop identification method for multiple-input multiple-output (MIMO) systems based on a reinforcement learning algorithm. Current frequency-domain identification algorithms usually depend on the choice of an attenuation factor, and the proposed method offers an attractive alternative that addresses this dependence. By continuously interacting with the environment, continuous action reinforcement learning automata (CARLA) identify the optimal attenuation factor, after which the corresponding parameters are estimated. Moreover, because of its online learning ability, the proposed method could be applied online to time-varying systems. Simulation results suggest that the presented approach meets the identification accuracy requirements for both square and non-square systems.
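
As a rough illustration of how a CARLA-style search over a single scalar such as the attenuation factor might proceed, the following minimal sketch keeps a probability density over an action interval, samples an action from it, and reinforces the density around actions whose cost beats the median of past costs. It is not the implementation used in this work: the function name `carla_search`, the grid discretization, the gain values `g_h` and `g_w`, and the quadratic `toy_cost` (a stand-in for the identification error) are all assumptions made for the example.

```python
import math
import random
import statistics


def carla_search(cost, lo, hi, iters=300, grid=201, g_h=0.3, g_w=0.02, seed=0):
    """Sketch of a CARLA-style search on [lo, hi] (assumed discretization)."""
    rng = random.Random(seed)
    width = hi - lo
    dx = width / (grid - 1)
    xs = [lo + i * dx for i in range(grid)]
    pdf = [1.0 / width] * grid          # start from a uniform density
    sigma = g_w * width                 # spread of the reinforcement bump
    history = []                        # costs observed so far

    for _ in range(iters):
        # Sample an action from the current density (inverse CDF on the grid).
        cdf = [0.0]
        for i in range(1, grid):
            cdf.append(cdf[-1] + 0.5 * (pdf[i - 1] + pdf[i]) * dx)
        u = rng.random() * cdf[-1]
        idx = next(i for i, c in enumerate(cdf) if c >= u)
        action = xs[idx]

        # Evaluate the action and turn its cost into a reward in [0, 1].
        j = cost(action)
        history.append(j)
        j_med = statistics.median(history)
        j_min = min(history)
        beta = 0.0
        if j_med > j_min:
            beta = max(0.0, (j_med - j) / (j_med - j_min))

        # Add a Gaussian bump around the action, scaled by the reward.
        pdf = [
            p + beta * (g_h / width)
            * math.exp(-((x - action) ** 2) / (2.0 * sigma ** 2))
            for p, x in zip(pdf, xs)
        ]

        # Renormalize so the density integrates to one (trapezoidal rule).
        area = sum(0.5 * (pdf[i - 1] + pdf[i]) * dx for i in range(1, grid))
        pdf = [p / area for p in pdf]

    return xs, pdf


def toy_cost(a):
    # Hypothetical cost with its minimum at a = 0.3; not from this work.
    return (a - 0.3) ** 2


xs, pdf = carla_search(toy_cost, 0.0, 1.0)
best = xs[max(range(len(pdf)), key=pdf.__getitem__)]  # mode of the density
```

In this sketch the mode of the final density, `best`, serves as the estimate of the optimal action. In the setting described above, the cost would instead come from the closed-loop identification error obtained with a candidate attenuation factor.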