Vibration-based structural damage detection is one of the most promising avenues for building smart, automated structural health monitoring applications; however, its applicability is impeded by the large volume of vibration data collected, and its performance can be undermined by degraded data. This study therefore develops a robust framework, dubbed AutoBoost-SDD, that can effectively handle contaminated vibration data and provide reliable monitoring results at a reasonable computational cost. The proposed method consists of three key components. First, multi-domain feature extraction techniques convert high-dimensional raw data into informative feature vectors. Second, an auto-encoder deep learning architecture refines the feature vectors of contaminated data. Finally, a tree-based boosting machine learning algorithm, namely LightGBM, assesses the structures’ operational states using the refined feature vectors produced in the second step. The viability and performance of the proposed framework are illustrated via three case studies: numerical data from a 5-degree-of-freedom system, numerical data from a 2D frame structure, and experimental data from a large-scale 18-story frame structure reported in the literature. The results show that the AutoBoost-SDD framework provides reasonable detection results despite various forms of data contamination, including noise, missing data, and anomalies.
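To make the first component more concrete, the following is a minimal sketch of multi-domain feature extraction for a single vibration record. It is an illustration under stated assumptions, not the implementation used in the study: the particular features (RMS, peak-to-peak amplitude, kurtosis, dominant frequency, spectral centroid) and all function names are hypothetical choices, and a naive discrete Fourier transform from the Python standard library stands in for an optimized FFT routine.

```python
import cmath
import math


def time_domain_features(signal):
    """Return [RMS, peak-to-peak amplitude, kurtosis] of one record."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak_to_peak = max(signal) - min(signal)
    var = sum((x - mean) ** 2 for x in signal) / n
    # Non-excess kurtosis; set to 0.0 for a constant signal (assumed convention).
    if var > 0:
        kurt = sum((x - mean) ** 4 for x in signal) / (n * var * var)
    else:
        kurt = 0.0
    return [rms, peak_to_peak, kurt]


def frequency_domain_features(signal, fs):
    """Return [dominant frequency, spectral centroid] in Hz via a naive DFT."""
    n = len(signal)
    half = n // 2
    freqs = []
    mags = []
    for k in range(1, half + 1):  # skip the DC bin (k = 0)
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(signal)
        )
        freqs.append(k * fs / n)
        mags.append(abs(coeff))
    total = sum(mags)
    dominant = freqs[mags.index(max(mags))]
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total if total > 0 else 0.0
    return [dominant, centroid]


def extract_features(signal, fs):
    """Concatenate time- and frequency-domain features into one vector."""
    return time_domain_features(signal) + frequency_domain_features(signal, fs)
```

Calling `extract_features(record, fs)` on a list of acceleration samples `record` with sampling rate `fs` (in Hz) returns a five-element feature vector. In a pipeline of the kind described above, vectors like this would then be passed to the auto-encoder and, after refinement, to the boosting classifier.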