Ultra-Wideband (UWB) technology is widely used in indoor localization systems because of its high accuracy and robustness in multipath environments. However, Non-Line-of-Sight (NLoS) conditions distort UWB signals and significantly reduce positioning accuracy. Distinguishing NLoS from Line-of-Sight (LoS) scenarios and mitigating the resulting positioning errors is therefore essential for improving UWB system performance. This research proposes a novel 1D-ConvLSTM-Attention network (1D-CLANet) that extracts temporal features from the UWB channel impulse response (CIR) and identifies NLoS scenarios. The model combines convolutional neural network (CNN) and long short-term memory (LSTM) architectures to extract temporal CIR features, and introduces the Squeeze-and-Excitation (SE) attention mechanism to emphasize the most informative features. Applying SE attention to the LSTM outputs improves the model’s ability to differentiate between NLoS categories. Experimental results show that the proposed 1D-CLANet with SE attention distinguishes multiple NLoS scenarios with limited computational resources, attaining an accuracy of 95.58%. It outperforms variants that use other attention mechanisms as well as the version of 1D-CLANet without attention. Compared with other advanced methods, the SE-enhanced 1D-CLANet is significantly better at separating LoS from similar NLoS scenarios, such as human obstructions, which raises overall recognition accuracy in complex environments.
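To make the SE step concrete, the following is a minimal pure-Python sketch of Squeeze-and-Excitation gating applied to a sequence of LSTM output vectors. It is an illustration under stated assumptions, not the paper's implementation: the function name `se_attention`, the helper `random_matrix`, the sequence length, the channel count, and the reduction ratio are all hypothetical, and the weights are random rather than trained. The abstract does not specify the axis over which the squeeze is taken; averaging over time to obtain one gate per channel is one plausible reading.

```python
import math
import random


def se_attention(features, w1, w2):
    """Apply Squeeze-and-Excitation gating to a (time, channels) feature map.

    features: list of time steps, each a list of channel values
              (standing in for LSTM outputs; an assumption of this sketch).
    w1:       bottleneck weights, shape (channels // reduction, channels).
    w2:       expansion weights, shape (channels, channels // reduction).
    """
    n_steps = len(features)
    n_channels = len(features[0])

    # Squeeze: average each channel over the time axis.
    squeezed = [sum(step[c] for step in features) / n_steps
                for c in range(n_channels)]

    # Excitation, part 1: bottleneck projection followed by ReLU.
    hidden = [max(0.0, sum(w * z for w, z in zip(row, squeezed)))
              for row in w1]

    # Excitation, part 2: project back to one gate per channel,
    # squashed into (0, 1) by a sigmoid.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]

    # Scale: reweight every time step channel by channel.
    scaled = [[x * g for x, g in zip(step, gates)] for step in features]
    return scaled, gates


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for trained weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_steps, n_channels, reduction = 6, 8, 4  # illustrative sizes only
lstm_out = [[rng.uniform(-1.0, 1.0) for _ in range(n_channels)]
            for _ in range(n_steps)]
w1 = random_matrix(n_channels // reduction, n_channels, rng)
w2 = random_matrix(n_channels, n_channels // reduction, rng)

scaled, gates = se_attention(lstm_out, w1, w2)
```

Each gate lies strictly between 0 and 1, so the block can only attenuate channels relative to one another; in a trained network the gates would be learned so that channels carrying NLoS-discriminative CIR features are attenuated least.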