Purpose
Machine learning (ML) models are increasingly used in industrial maintenance to predict system failures. However, less is known about how the time windows for reading data and making predictions affect performance. Therefore, the purpose of this research is to assess the impact of different sliding windows on prediction performance.

Design/methodology/approach
The authors conducted a factorial experiment using high-dimensional machine data covering two years of operation, taken from a real industrial case in the production of high-precision milled and turned parts. The impacts of different reading and prediction windows were tested for three ML algorithms (random forest, support vector machines and logistic regression) and four metrics (accuracy, precision, recall and F-score).

Findings
The results reveal (1) the critical role of the prediction window, contingent upon the application domain, (2) a non-monotonic relationship between the reading window and performance, and (3) how sliding window selection can be used systematically to improve different facets of performance.

Originality/value
The study's findings advance the knowledge of ML-based failure prediction by highlighting how systematic variation of two important but as yet understudied factors contributes to the development of more useful prediction models.
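
To make the two sliding windows concrete, the following is a minimal sketch of how labelled samples could be built from a time-ordered machine log. It is not the authors' pipeline. The function name `make_windowed_samples`, the toy data, and the window definitions are assumptions made for illustration: the reading window is taken to be the number of most recent time steps flattened into one feature vector, and the prediction window the number of following time steps in which any failure yields a positive label.

```python
from itertools import product
from typing import List, Sequence, Tuple


def make_windowed_samples(
    readings: Sequence[Sequence[float]],
    failures: Sequence[int],
    reading_window: int,
    prediction_window: int,
) -> Tuple[List[List[float]], List[int]]:
    """Build (features, label) pairs from a time-ordered log.

    Assumed definitions (hypothetical, not taken from the study):
      - features: the last `reading_window` steps up to and including t
      - label: 1 if any failure occurs in the `prediction_window`
        steps after t, else 0
    """
    if reading_window < 1 or prediction_window < 1:
        raise ValueError("both windows must be at least one time step")
    if len(readings) != len(failures):
        raise ValueError("readings and failures must have the same length")

    features: List[List[float]] = []
    labels: List[int] = []
    n = len(readings)
    # t is the index of the most recent observed time step.
    for t in range(reading_window - 1, n - prediction_window):
        past = readings[t - reading_window + 1 : t + 1]
        future = failures[t + 1 : t + 1 + prediction_window]
        # Flatten the past steps into a single feature vector.
        features.append([value for step in past for value in step])
        labels.append(1 if any(future) else 0)
    return features, labels


# Toy log: one sensor value per time step, one failure at step 4.
toy_readings = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]]
toy_failures = [0, 0, 0, 0, 1, 0]

# Full factorial grid over both window sizes (illustrative values only).
for r, p in product([1, 2], [1, 2]):
    X, y = make_windowed_samples(toy_readings, toy_failures, r, p)
    print(f"reading={r} prediction={p} samples={len(X)} positives={sum(y)}")
```

In a setup like this, each combination of window sizes produces a different training set, so a classifier would be trained and scored once per grid cell. A longer prediction window turns more samples into positives, which is one reason the choice of window can shift precision and recall in different directions.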