Safely landing unmanned aerial vehicles (UAVs) in unknown, GPS-denied environments is challenging but crucial. Traditional landing methods are often unsuitable, especially over complex terrain with insufficient map information. This report proposes a multi-stage UAV landing framework that combines (i) point cloud and image fusion positioning, (ii) terrain analysis, and (iii) neural network semantic recognition to optimize landing site selection. In the first step, 3D point cloud and image data are fused to obtain a comprehensive perception of the environment. In the second step, an energy cost function that accounts for texture and flatness is used to identify potential landing sites from their energy scores. To simplify classification for precise landings, the results are stratified by the difficulty of the UAV landing scenario. In the third step, a network model that integrates ResNet50 with a convolutional block attention module (CBAM) is applied to analyze the candidate landing sites. Experimental results indicate a reduction in computational load and improved landing site identification accuracy. By fusing multi-modal data, the developed framework enhances the safety and feasibility of UAV landings in complex environments.
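As a rough illustration of the second step, the sketch below scores a candidate patch with an energy that combines a flatness term and a texture term. The abstract does not give the form of the cost function, so everything here is an assumption made for illustration: the function names (`flatness_cost`, `texture_cost`, `energy`), the plane-fit residual as the flatness measure, the mean gradient magnitude as the texture measure, and the weights `w_flat` and `w_tex`.

```python
import numpy as np


def flatness_cost(heights: np.ndarray) -> float:
    """RMS residual of a least-squares plane fitted to a 2D height patch.

    Assumed flatness measure: a perfectly planar patch (even if tilted)
    gives a value close to zero.
    """
    rows, cols = heights.shape
    ys, xs = np.mgrid[0:rows, 0:cols]
    # Design matrix for the plane z = a*x + b*y + c.
    A = np.column_stack([xs.ravel(), ys.ravel(), np.ones(heights.size)])
    z = heights.ravel().astype(float)
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    residuals = z - A @ coeffs
    return float(np.sqrt(np.mean(residuals ** 2)))


def texture_cost(gray: np.ndarray) -> float:
    """Mean gradient magnitude of a 2D grayscale patch.

    Assumed texture measure: a uniform patch gives exactly zero.
    """
    gy, gx = np.gradient(gray.astype(float))
    return float(np.mean(np.hypot(gx, gy)))


def energy(heights: np.ndarray, gray: np.ndarray,
           w_flat: float = 1.0, w_tex: float = 1.0) -> float:
    """Weighted sum of the two terms (hypothetical weights)."""
    return w_flat * flatness_cost(heights) + w_tex * texture_cost(gray)
```

Under these assumptions a lower energy corresponds to a flatter, less textured patch. Whether the report ranks sites by low or high energy, and how it weights the two terms, is not stated in the abstract.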