This paper proposes cross-scale with attention normalizing flow (CSA-Flow), a normalizing-flow approach enhanced with channel-attention (CA) and self-attention (SA) modules, for high-speed railway anomaly detection in complex industrial backgrounds, with the aim of reducing the manual workload of primary maintenance of high-speed electric multiple units. Detecting defects in industrial environments, which are characterized by intricate backgrounds and unclear subjects, poses significant challenges. To address this, CSA-Flow introduces a channel feature extraction module that combines pretrained convolutional neural network models with a CA module to capture information at different scales, and uses the SA module, with its larger receptive field, to capture more contextual information. On the MVTec-AD dataset, CSA-Flow achieves an area under the receiver operating characteristic curve (AUROC) score of 98.7%, with a score of 98.4% across all object classes. To further assess the effectiveness of CSA-Flow in complex background scenarios, we introduce a dedicated dataset of high-speed rail braking devices (HSRBDs). The experimental results show that CSA-Flow outperforms current state-of-the-art approaches in both AUROC score and recall score, validating its capability for detecting anomalies in complex industrial backgrounds.
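For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how AUROC and recall can be computed from per-image anomaly scores. It is not the authors' evaluation code: the function names `auroc` and `recall`, the toy labels and scores, and the threshold of 0.5 are illustrative assumptions. AUROC is computed here through its standard pairwise interpretation, that is, the probability that a randomly chosen anomalous sample receives a higher score than a randomly chosen normal one.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison; ties between scores count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]  # anomalous samples
    neg = [s for y, s in zip(labels, scores) if y == 0]  # normal samples
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recall(labels, scores, threshold):
    """Fraction of anomalous samples whose score reaches the threshold."""
    flagged = [s >= threshold for y, s in zip(labels, scores) if y == 1]
    if not flagged:
        raise ValueError("need at least one anomalous sample")
    return sum(flagged) / len(flagged)


# Hypothetical toy data: 1 = anomalous, 0 = normal.
labels = [0, 0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.30]

print(f"AUROC:  {auroc(labels, scores):.3f}")
print(f"Recall: {recall(labels, scores, threshold=0.5):.3f}")
```

AUROC is independent of any decision threshold, whereas recall depends on the threshold chosen. This is one reason the two metrics are reported together.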