This work implements machine learning algorithms to predict the bio-oil yield from the pyrolysis of lignocellulosic biomass, using the physicochemical properties and composition of the biomass feed and the pyrolysis conditions as inputs. Biomass pyrolysis is influenced by several process parameters, such as pyrolysis temperature, heating rate, biomass composition, and purge gas flow rate. The relationship between these parameters and the yields of the different pyrolysis products can be captured well by machine learning algorithms. In this study, four algorithms, namely multi-linear regression, gradient boosting, random forest, and decision tree, were trained on the dataset, and the resulting models were compared to identify the best method for predicting bio-oil yield. The gradient boosting model achieved regression scores of 0.97 and 0.89 on the training and testing sets, with root-mean-squared error (RMSE) values of 1.19 and 2.39, respectively, and overcame the problem of overfitting. The present study therefore provides an approach for training a generalized machine learning model that can be applied to large datasets while avoiding overfitting.
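As an illustration of the model comparison described above, the following is a minimal sketch using scikit-learn. It is not the study's code: the data are synthetic, the feature names, the response function, the 80/20 split, and all hyperparameters are assumptions chosen only to make the example run. It fits the four named model types and reports the regression score (taken here to be R², which is an assumption) and the RMSE on the training and testing sets, so that a large gap between the two can be read as a sign of overfitting.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT the study's dataset); all ranges are hypothetical.
rng = np.random.default_rng(0)
n = 300
temperature = rng.uniform(300, 700, n)   # pyrolysis temperature, deg C
heating_rate = rng.uniform(5, 100, n)    # deg C / min
carbon = rng.uniform(40, 55, n)          # carbon content of the feed, wt%
gas_flow = rng.uniform(20, 200, n)       # purge gas flow rate, mL / min
X = np.column_stack([temperature, heating_rate, carbon, gas_flow])

# Made-up response: yield peaks near 500 deg C, plus Gaussian noise.
y = (50 - 0.0008 * (temperature - 500) ** 2 + 0.03 * heating_rate
     + 0.2 * (carbon - 45) - 0.01 * gas_flow + rng.normal(0, 2, n))

# Hold out 20% of the samples for testing (an assumed split ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# The four model families named in the text, with assumed settings.
models = {
    "multi_linear_regression": LinearRegression(),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "decision_tree": DecisionTreeRegressor(random_state=0),
}

splits = {"train": (X_train, y_train), "test": (X_test, y_test)}
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    row = {}
    for split, (X_s, y_s) in splits.items():
        pred = model.predict(X_s)
        row[f"r2_{split}"] = r2_score(y_s, pred)
        row[f"rmse_{split}"] = float(np.sqrt(mean_squared_error(y_s, pred)))
    results[name] = row

# Print one line per model: train/test R^2 and RMSE side by side.
for name, row in results.items():
    print(f"{name:26s} R2 train={row['r2_train']:.2f} test={row['r2_test']:.2f} "
          f"RMSE train={row['rmse_train']:.2f} test={row['rmse_test']:.2f}")
```

In a sketch like this, an unpruned decision tree typically fits the training set almost perfectly while scoring noticeably lower on the test set, which is the overfitting pattern the comparison is meant to expose. The numbers it prints come from the synthetic data and are not comparable to the values reported above.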