Based on quasi-six-degree-of-freedom flight dynamics equations, a computational model of the radar detection probability of an aircraft in complex environments was constructed. The model accounts for the change in elevation angle caused by the increase in roll angle during a maneuvering turn, which raises the radar cross-section. By jointly considering the turning angle, roll angle, Mach number, and radar power factor, this study quantitatively analyzed the influence of these parameters on the radar detection probability and revealed how the detection probability varies under different flight conditions. The results provide theoretical support for the decision-tree-based Radar Valley Radius and Turning Maneuver Method (RVR-TM) and lay the foundation for subsequent intelligent decision-making models. To further optimize aircraft trajectory selection in complex environments, this study combines the theoretical analysis with reinforcement learning to establish an intelligent decision-making model. The model is trained with the Proximal Policy Optimization (PPO) algorithm and, through a precisely defined state space and reward functions, achieves intelligent trajectory planning for stealth aircraft in radar threat scenarios.
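
As a rough illustration of how roll angle can propagate into detection probability, the following Python sketch may help. It is not the model constructed in this study. It assumes a lumped radar range equation (SNR = K·σ/R⁴, with K standing in for the radar power factor), the standard single-pulse detection probability of a Swerling I target, and a purely hypothetical sin² interpolation of radar cross-section between level and banked flight. All function names and numerical values are illustrative assumptions.

```python
import math


def snr_linear(power_factor, rcs_m2, range_km):
    """Lumped radar range equation: SNR = K * sigma / R^4 (linear, not dB)."""
    return power_factor * rcs_m2 / range_km ** 4


def detection_probability(snr, pfa=1e-6):
    """Single-pulse Pd for a Swerling I target: Pd = Pfa ** (1 / (1 + SNR))."""
    return pfa ** (1.0 / (1.0 + snr))


def rcs_with_roll(rcs_level_m2, rcs_banked_m2, roll_deg):
    """Hypothetical toy model (assumption, not from the study):
    interpolate RCS between level and fully banked values with sin^2(roll)."""
    weight = math.sin(math.radians(roll_deg)) ** 2
    return rcs_level_m2 + (rcs_banked_m2 - rcs_level_m2) * weight


# Illustrative values only: K, the RCS bounds, and the range are assumptions.
K = 1.0e9          # lumped radar power factor, chosen so SNR is O(1..100)
RANGE_KM = 50.0    # slant range to the radar

for roll in (0.0, 30.0, 60.0):
    sigma = rcs_with_roll(0.01, 1.0, roll)
    pd = detection_probability(snr_linear(K, sigma, RANGE_KM))
    print(f"roll={roll:4.0f} deg  rcs={sigma:.3f} m^2  Pd={pd:.3f}")
```

Under these assumptions the detection probability rises monotonically with roll angle at a fixed range, which is the qualitative trade-off a turn-planning policy would have to weigh against the benefit of turning away from the radar.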