Research on lignocellulose pretreatments generally relies on experiments that require substantial resources, are often time-consuming and are not always environmentally friendly. Researchers are therefore developing computational methods that can reduce the amount of experimental work and its cost. In this research, three machine learning methods, Random Forest (RF), Extreme Gradient Boosting (XGB) and Support Vector Machine (SVM), as well as their ensembles, were evaluated for predicting acid-insoluble detergent lignin (AIDL) content in lignocellulosic biomass. Three types of harvest residue (maize stover, soybean straw and sunflower stalk) were first pretreated with hot air in a laboratory oven at two temperatures (121 and 175 °C) for two durations (30 and 90 min), with the aim of disintegrating the lignocellulosic structure, i.e., delignification. Based on leave-one-out cross-validation, XGB achieved the highest accuracy for every individual harvest residue, with a coefficient of determination (R²) in the range of 0.756–0.980. The relative variable importances for all individual harvest residues strongly suggested that pretreatment temperature has a dominant impact compared with pretreatment duration. These findings demonstrate the effectiveness of machine learning prediction in the optimization of lignocellulose pretreatment, leading to a more efficient lignin destabilization approach.
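The evaluation scheme described above (leave-one-out cross-validation, R² and relative variable importance) can be illustrated with a minimal sketch. It is not the pipeline used in this research: the data are synthetic placeholders, the number of replicates and the assumed response function are invented for illustration, and scikit-learn's `RandomForestRegressor` stands in for the RF model only. XGB and SVM regressors could be substituted for `model` in the same way.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# SYNTHETIC placeholder data, NOT measurements from the study.
rng = np.random.default_rng(0)
temperatures = [121.0, 175.0]  # pretreatment temperature, deg C
durations = [30.0, 90.0]       # pretreatment duration, min
n_replicates = 5               # hypothetical number of replicates per combination

# One row per sample: [temperature, duration]
X = np.array([[t, d]
              for t in temperatures
              for d in durations
              for _ in range(n_replicates)])

# Assumed response: lignin content drops strongly with temperature and
# weakly with duration, plus measurement noise (illustrative only).
y = (30.0
     - 0.08 * (X[:, 0] - 121.0)
     - 0.01 * (X[:, 1] - 30.0)
     + rng.normal(0.0, 0.3, size=len(X)))

model = RandomForestRegressor(n_estimators=100, random_state=0)

# Leave-one-out: each sample is predicted by a model trained on all the others.
y_pred = cross_val_predict(model, X, y, cv=LeaveOneOut())
r2_loo = r2_score(y, y_pred)

# Relative variable importance from a model fitted on the full data set.
model.fit(X, y)
importance = dict(zip(["temperature", "duration"], model.feature_importances_))

print(f"LOO R2: {r2_loo:.3f}")
print(importance)
```

In this sketch R² is computed once over the pooled leave-one-out predictions, because R² is undefined for a single held-out sample.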