Pan–tilt–zoom cameras are commonly used in surveillance applications. Automating them could reduce the workload of human operators and increase airport safety by tracking anomalous objects such as drones. Reinforcement learning is an artificial intelligence method that has outperformed humans on certain specific tasks. However, data and benchmarks for pan–tilt–zoom control in tracking airborne objects are lacking. Here, we present a simulated environment containing a pan–tilt–zoom camera, which we use to train and evaluate a reinforcement learning agent. We found that the agent can learn to track the drone in our basic tracking scenario, exceeding the benchmark value at which the scenario is considered solved. The agent was also tested on more complex scenarios in which the drone is occluded behind obstacles. While the agent does not quantitatively outperform the optimal human model, it shows qualitative signs of learning to solve the complex scenario with an occluded, non-linear trajectory. Given further training, investigation, and different algorithms, we believe a reinforcement learning agent could solve such scenarios consistently. Our results demonstrate how complex drone surveillance tracking scenarios may be solved and fully automated by reinforcement learning agents. We hope our environment becomes a starting point for more sophisticated autonomy in the control of pan–tilt–zoom cameras that track drones and surveil airspace for anomalous objects. For example, distributed multi-agent systems of pan–tilt–zoom cameras combined with other sensors could lead towards fully autonomous surveillance capable of challenging experienced human operators.