In this study, railway vehicle‐body vibration is applied to rail detection because it allows convenient sensor deployment and is cost‐effective. However, the vibration waveform is difficult to analyze because of damping and interference. Data‐driven methods can help relate multidimensional signals to complex rail‐surface irregularities, but their results are prone to uncertainty. This study proposes a method in which a deep learning framework is coupled with heterogeneous factors at each link of its ensemble strategy. The instantiation, module foundation, and scenario description together establish a concrete system for addressing the dilemma in which an insufficient database hinders feature extraction and leaves the model's capacity unproductive. Performance is evaluated quantitatively across different levels of on‐site rail‐surface conditions: prediction errors reach 4.7% and 6.5%, and classifier accuracies reach 98.4% and 93.7%, for irregularities and defect severities, respectively. This work describes a way to extend the applicability of self‐learners in industry and offers new support for railway track management.