Unmanned aerial vehicle (UAV) networks have become a promising approach for providing wireless coverage to regions with limited connectivity. The combination of UAV networks and technologies such as the Internet of Things (IoT) has improved the quality of life of people living in rural areas. It is therefore crucial to implement fast, low-complexity, and effective strategies for UAV placement and resource allocation. In this chapter, a deep reinforcement learning (DRL) solution based on the proximal policy optimization (PPO) algorithm is proposed to maximize the coverage provided to users requesting microservice-based IoT applications. To maximize coverage and adapt autonomously to the environment in real time, the algorithm seeks optimal flight paths for the set of UAVs, taking into account the users' locations and the applicable flight restrictions. Simulation results for a realistic scenario show that the proposed solution maximizes the percentage of covered users.