“…Whereas most of the attention in deep RL has focused on game theory (e.g., to solve Atari games [14]), the same principles can be used to solve path planning and trajectory optimization. In our previous study [15], we showed that the RL agent¹ can learn a policy with a performance comparable to the analytically derived optimal trajectory. This approach highlighted the potential for tracking marine animals by autonomous underwater vehicles and could enable coordinated fleets of vehicles to localize and track a set of underwater assets via multi-agent, multitarget approaches that are currently intractable with existing methodologies.…”

¹ Data and materials availability: The range-only target localization algorithms with deep RL are available on GitHub: github.com/imasmitja/RLforUTracking