This paper studies how to schedule wireless transmissions from sensors in order to estimate the states of multiple remote, dynamic processes. Sensors make observations of each of the processes, and this information has to be transmitted to a central gateway over a wireless network for monitoring purposes, where typically fewer wireless channels are available than there are processes to be monitored. Such estimation problems routinely occur in large-scale Cyber-Physical Systems, especially when the dynamic systems (processes) involved are geographically separated. For effective estimation at the gateway, the sensors need to be scheduled appropriately, i.e., at each time instant one must decide which sensors have network access and which do not. To solve this scheduling problem, we formulate an associated Markov decision process (MDP). We then solve this MDP using a Deep Q-Network, a deep reinforcement learning algorithm that is at once scalable and model-free. We compare our scheduling algorithm to popular scheduling algorithms such as round-robin and reduced waiting time, among others. Our algorithm is shown to significantly outperform these algorithms for randomly generated example scenarios.
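
To make the approach concrete, the following is a minimal, self-contained sketch of what a DQN-based sensor scheduler might look like, assuming a single wireless channel (the action is the index of the sensor granted access), a scheduler state given by the time since each sensor last transmitted, and a reward equal to the negative sum of those ages as a crude proxy for estimation error at the gateway. All names (`QNetwork`, `select_sensor`, `train_step`, `step_environment`) and hyperparameter values are illustrative assumptions, not taken from the paper, whose formulation is based on estimation error covariances.

```python
import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim

# Illustrative hyperparameters, not values from the paper.
N_SENSORS = 6           # number of remote processes/sensors
STATE_DIM = N_SENSORS   # state: time since each sensor last transmitted
GAMMA = 0.95            # discount factor
EPSILON = 0.1           # exploration rate
BATCH_SIZE = 32

class QNetwork(nn.Module):
    """Maps the scheduler state to one Q-value per candidate sensor."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)

q_net = QNetwork(STATE_DIM, N_SENSORS)
optimizer = optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)  # experience replay buffer

def select_sensor(state: torch.Tensor) -> int:
    """Epsilon-greedy choice of which sensor gets the channel."""
    if random.random() < EPSILON:
        return random.randrange(N_SENSORS)
    with torch.no_grad():
        return int(q_net(state).argmax())

def step_environment(state: torch.Tensor, action: int):
    """Toy dynamics: all ges of information grow by one time step and the
    scheduled sensor's age resets; reward is the negative sum of ages,
    standing in for the paper's estimation-error-based reward."""
    next_state = state + 1.0
    next_state[action] = 0.0
    reward = -float(next_state.sum())
    return next_state, reward

def train_step():
    """One gradient step on the standard DQN temporal-difference target."""
    if len(replay) < BATCH_SIZE:
        return
    batch = random.sample(replay, BATCH_SIZE)
    states, actions, rewards, next_states = zip(*batch)
    states = torch.stack(states)
    actions = torch.tensor(actions)
    rewards = torch.tensor(rewards, dtype=torch.float32)
    next_states = torch.stack(next_states)
    q_pred = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_target = rewards + GAMMA * q_net(next_states).max(dim=1).values
    loss = nn.functional.mse_loss(q_pred, q_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Illustrative interaction loop.
state = torch.zeros(STATE_DIM)
for t in range(1000):
    action = select_sensor(state)
    next_state, reward = step_environment(state, action)
    replay.append((state, action, reward, next_state))
    train_step()
    state = next_state
```

In this sketch the scheduler learns, from interaction alone, which sensor to grant the channel at each instant, which is what makes the DQN approach model-free; scaling to multiple channels would enlarge the action space to subsets of sensors rather than single indices.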