The potential for the use of drones in logistics and transportation is continuously growing, with multiple applications in both urban and rural environments. The safe navigation of drones in such environments is a major challenge that requires sophisticated algorithms and systems that can quickly and efficiently assess the situation, find the shortest path to the target, and detect and avoid obstacles. Traditional path planning algorithms are unable to handle the dynamic and uncertain nature of real environments, while traditional machine learning models are insufficient due to the constantly changing conditions that affect the locations of the drone and of the obstacles. Reinforcement learning (RL) algorithms have been widely used for autonomous navigation problems; however, the computational complexity and energy demands of such methods can become a bottleneck in the execution of UAV flights. In this paper, we propose the use of a minimal set of sensors together with RL algorithms for the safe and efficient navigation of drones in urban and rural environments. Our approach accounts for the complex and dynamic nature of such environments by incorporating real-time data from low-cost onboard sensors. After a thorough review of existing solutions for drone path planning and navigation in 3-D environments, we experimentally evaluate the proposed approach in a simulated environment under various scenarios. The test results demonstrate the effectiveness of the proposed RL-based approach for drone navigation in complex and unconstrained environments. The implemented approach can serve as a basis for the development of advanced and robust navigation systems for drones, improving the safety and efficiency of transportation applications in the near future.

INDEX TERMS Actor-critic algorithm, drone navigation, dynamic path planning, proximal policy optimization, reinforcement learning.