Obstacle avoidance for small unmanned aircraft is vital for the safety of future urban air mobility (UAM) and Unmanned Aircraft System (UAS) Traffic Management (UTM). A variety of techniques exist for real-time, robust drone guidance, but many of them operate on discretized airspace and control, which requires an additional path-smoothing step to provide flexible commands for UAS. To deliver safe and computationally efficient guidance for UAS operations, we explore the use of a deep reinforcement learning algorithm based on Proximal Policy Optimization (PPO) to guide autonomous UAS to their destinations while avoiding obstacles through continuous control. The proposed scenario state representation and reward function map the continuous state space to continuous control of both heading angle and speed. To verify the effectiveness of the proposed learning framework, we conducted numerical experiments with static and moving obstacles. Uncertainties associated with the environments and safe operation bounds are investigated in detail. Results show that the proposed model provides accurate and robust guidance and resolves conflicts with a success rate of over 99%.

INDEX TERMS Continuous control, deep reinforcement learning, UAS obstacle avoidance, uncertainty.
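To make the notion of continuous control of heading angle and speed concrete, the following minimal sketch shows one way a policy's normalized outputs could be turned into a heading change and a speed command and then applied in a simple planar kinematic update. It is an illustration only, not the paper's implementation: the function names `scale_action` and `step`, the turn-rate and speed limits, and the point-mass kinematics are all assumptions introduced here.

```python
import math

# Hypothetical limits chosen for illustration; the source does not state them.
MAX_TURN_PER_STEP = math.radians(15.0)  # largest heading change per step (rad)
MIN_SPEED, MAX_SPEED = 0.0, 20.0        # speed bounds (m/s)


def scale_action(a_heading, a_speed):
    """Map two policy outputs in [-1, 1] to a heading change (rad) and a speed (m/s)."""
    # Clamp to the nominal range in case the policy samples outside it.
    a_heading = max(-1.0, min(1.0, a_heading))
    a_speed = max(-1.0, min(1.0, a_speed))
    d_heading = a_heading * MAX_TURN_PER_STEP
    speed = MIN_SPEED + 0.5 * (a_speed + 1.0) * (MAX_SPEED - MIN_SPEED)
    return d_heading, speed


def step(x, y, heading, a_heading, a_speed, dt=1.0):
    """Advance an assumed planar point-mass model by one time step."""
    d_heading, speed = scale_action(a_heading, a_speed)
    heading = (heading + d_heading) % (2.0 * math.pi)
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    return x, y, heading
```

Because both commands vary smoothly with the policy output, no separate path-smoothing step is needed in this sketch, which is the motivation the abstract gives for continuous rather than discretized control.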