In recent years, Unmanned Aerial Vehicles (UAVs) have attracted growing interest as a promising technology for the next generation of wireless networks. Integrating UAVs into wireless systems can considerably enhance the Quality of Service (QoS) of clients within the Internet of Things (IoT) infrastructure. However, maximizing QoS while ensuring the security of user information makes determining UAV trajectories challenging, because the trajectories must account for the mobility and density of users. This article addresses this issue by considering a heterogeneous UAV-assisted network in which UAVs acting as Base Stations (BSs) find the best trajectory based on the movement and density of Ground Users (GUs), in a way that guarantees improved QoS and information security for the covered users. To achieve these objectives, we first introduce an Actor-Critic (AC) scheme for optimizing UAV trajectories, since the environment is continuous with a large number of states and actions. In addition, the accuracy and learning speed of the agent in AC algorithms are higher than in other Reinforcement Learning (RL) schemes. We then combine ordinary Federated Learning (FL) and Distillation Federated Learning (DFL) algorithms with AC to increase the security of users' data, accelerate the learning of the UAVs, and improve the QoS of GUs. The FL scheme limits access to users' principal information, but it cannot prevent reverse-engineering techniques from recovering users' data. To guard against such attacks, we use the DFL scheme, which is associated with a low hacking probability owing to the dynamic nature of weight updating in the neural network.
Simulation results demonstrate that the FL-AC and DFL-AC algorithms significantly enhance the learning speed of the proposed schemes while increasing the downlink rate of users by approximately 1.1% and 1.5%, respectively, compared to the AC method.

INDEX TERMS Distillation federated learning (DFL), deep reinforcement learning (DRL), unmanned aerial vehicle (UAV), actor-critic (AC), multi-agent algorithm, UAVs' trajectory.