Nowadays, Wireless Sensor Networks (WSNs) play a vital and sustainable role in many verticals touching different aspects of our lives, including civil, public, and military applications. WSNs typically consist of a few to several sensor nodes that are connected to each other via wireless communication links and require either real-time or delayed data transfer. In this paper, we propose an autonomous Unmanned Aerial Vehicle (UAV)-enabled data-gathering mechanism for delay-tolerant WSN applications. The objective is to employ a self-trained UAV as a flying mobile unit that collects data from ground sensor nodes spatially distributed over a given geographical area during a predefined period of time. In this approach, two Reinforcement Learning (RL) techniques, namely the Deep Deterministic Policy Gradient (DDPG) and Q-learning (QL) algorithms, are jointly employed to train the UAV to understand the environment and produce an effective schedule for its data collection mission. DDPG is used to autonomously decide the best trajectory to adopt in an obstacle-constrained environment, while QL determines the order in which the nodes are visited so that the data collection time is minimized. The schedule is obtained while accounting for the limited battery capacity of the flying unit, its need to return to the charging station, the time windows of data acquisition, and the priority of certain sensor nodes. Customized reward functions are designed for each RL model and, through numerical simulations, we investigate their training performance. We also analyze the behavior of the autonomous UAV for different selected scenarios and corroborate the ability of the proposed approach to perform effective data collection. A comparison with the deterministic optimal solution is provided to validate the performance of the learning-based approach.
INDEX TERMS Internet-of-Things, data gathering, reinforcement learning, scheduling, unmanned aerial vehicles
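To make the scheduling component concrete, the following is a minimal, self-contained sketch of how tabular Q-learning can be used to learn a visiting order over ground nodes, with the reward defined as the negative travel distance between consecutive stops. The node coordinates, hyperparameters, and reward shaping below are illustrative assumptions only, not the paper's actual configuration, which additionally accounts for battery limits, time windows of data acquisition, and node priorities.

```python
# Illustrative sketch: tabular Q-learning for the node-visiting order of a UAV.
# All quantities here (node layout, reward, hyperparameters) are hypothetical;
# battery, time windows, and priorities from the paper's model are omitted.
import math
import random

random.seed(0)
N = 5  # number of ground sensor nodes (hypothetical)
coords = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(N)]
depot = (0.0, 0.0)  # charging station / start position

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

# State: (current node index, or -1 for the depot; frozenset of visited nodes).
# Action: index of the next unvisited node to fly to.
Q = {}

def q(state, action):
    return Q.get((state, action), 0.0)

alpha, gamma, eps = 0.1, 0.95, 0.2  # learning rate, discount, exploration
for episode in range(20000):
    pos, visited = -1, frozenset()
    while len(visited) < N:
        actions = [a for a in range(N) if a not in visited]
        # Epsilon-greedy action selection.
        if random.random() < eps:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda x: q((pos, visited), x))
        here = depot if pos == -1 else coords[pos]
        r = -dist(here, coords[a])  # reward = negative travel distance
        nxt_visited = visited | {a}
        nxt_actions = [x for x in range(N) if x not in nxt_visited]
        best_next = max((q((a, nxt_visited), x) for x in nxt_actions), default=0.0)
        state = (pos, visited)
        # Standard Q-learning update.
        Q[(state, a)] = q(state, a) + alpha * (r + gamma * best_next - q(state, a))
        pos, visited = a, nxt_visited

# Greedy rollout of the learned policy yields the visiting order.
pos, visited, order = -1, frozenset(), []
while len(visited) < N:
    actions = [a for a in range(N) if a not in visited]
    a = max(actions, key=lambda x: q((pos, visited), x))
    order.append(a)
    pos, visited = a, visited | {a}
print("learned visiting order:", order)
```

In the paper's setting, a schedule of this kind would be executed by the DDPG-driven trajectory controller, which handles continuous flight actions and obstacle avoidance between the stops chosen by the QL scheduler.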