When a target localization algorithm based on reinforcement learning is trained on few-sample data sets, localization accuracy is low because the model fits the data poorly. To address this, this paper builds on the deep reinforcement learning target localization algorithm and proposes a target localization algorithm based on meta-reinforcement learning. First, during the initial training of the model, the meta-parameters are classified and stored according to the similarity of the training tasks. Then, for a new target localization task, task features are extracted and the meta-parameters with the highest similarity are matched and used as the initial parameters for model training. The model dynamically updates the meta-parameter pool so that the pool retains the optimal meta-parameters for multiple different types of features, which improves generalization ability and recognition accuracy across multiple types of target localization tasks. Experimental results show that, in a variety of single-target localization tasks and under the same data set size as the original reinforcement learning target localization algorithm, the model converges within a small number of training steps when the matched meta-parameters from the meta-parameter pool are used as the initial training parameters. Moreover, relative to training from random initial parameters, the meta-reinforcement learning method based on MAML-RL increases training speed by 28.2%, and the meta-reinforcement learning method proposed in this paper increases it by 34.9%, indicating that the proposed algorithm effectively improves training speed, generalization performance, and localization accuracy for target localization.
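
The store, match, and update cycle of the meta-parameter pool can be illustrated with a minimal sketch. It is not the paper's implementation. The following are assumptions made only for illustration: task features are plain numeric vectors, similarity is measured by cosine similarity, a hypothetical threshold `same_type_threshold` decides whether two tasks are of the same type, and a scalar `score` (higher is better) decides which meta-parameters count as optimal. All class and function names are hypothetical.

```python
import math
from dataclasses import dataclass, field


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class PoolEntry:
    feature: list   # task feature vector (assumed representation)
    params: list    # meta-parameters stored for this task type
    score: float    # quality of these parameters, higher is better (assumed)


@dataclass
class MetaParameterPool:
    same_type_threshold: float = 0.9  # hypothetical similarity cutoff
    entries: list = field(default_factory=list)

    def match(self, feature):
        """Return the entry most similar to `feature`, or None if the pool is empty."""
        if not self.entries:
            return None
        return max(self.entries,
                   key=lambda e: cosine_similarity(e.feature, feature))

    def update(self, feature, params, score):
        """Keep one best entry per task type; add a new entry for unseen types."""
        best = self.match(feature)
        if (best is not None and
                cosine_similarity(best.feature, feature) >= self.same_type_threshold):
            # Same task type: replace only if the new parameters score higher.
            if score > best.score:
                best.feature, best.params, best.score = list(feature), list(params), score
            return
        # No sufficiently similar entry: treat this as a new task type.
        self.entries.append(PoolEntry(list(feature), list(params), score))


def initial_parameters(pool, feature, random_init):
    """Use the matched meta-parameters if any exist, else call random_init()."""
    entry = pool.match(feature)
    return list(entry.params) if entry is not None else random_init()
```

In this sketch a new task starts from the stored parameters of its most similar task type and falls back to random initialization only when the pool is empty. After training, calling `update` with the new parameters and their score keeps the pool current.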