This study investigated the potential use of pumpkin seed oil in biodiesel production. First, the fatty acid composition of oil extracted from discarded pumpkin seeds was determined. Biodiesel produced from this oil was then tested in an engine test setup. The performance and emission values of a four-cylinder diesel engine fueled with diesel (D100), biodiesel (PB100), and blended fuels (PB2D98, PB5D95, and PB20D80) were determined. Furthermore, three machine learning algorithms (artificial neural networks, XGBoost, and random forest) were employed to model engine performance and emission parameters. Models were generated from the data for the PB100, PB2D98, and PB5D95 fuels, and model performance was assessed with the R², RMSE, and MAPE metrics. The highest torque value (333.15 Nm) was obtained with D100 fuel at 1200 rpm. PB2D98 (2% biodiesel, 98% diesel) had the lowest specific fuel consumption (194.33 g HPh⁻¹) at 1600 rpm. The highest brake thermal efficiency (BTE) value (30.92%) was obtained with diesel fuel at 1400 rpm. Among the blended fuels, PB2D98 was the most fuel-efficient. Overall, in terms of engine performance and emission values, PB2D98 showed the closest results to diesel fuel. A comparison of the machine learning algorithms revealed that artificial neural networks (ANNs) generally performed best. However, the XGBoost algorithm was more successful than the other algorithms at predicting the performance and emissions of PB20D80 fuel. These findings indicate that XGBoost could be a more reliable option for predicting engine performance and emissions, especially for data-deficient fuels such as PB20D80.
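For readers unfamiliar with the three evaluation metrics named above, the following is a minimal sketch of how R², RMSE, and MAPE are conventionally computed from measured and predicted values. The function names and the sample torque values are hypothetical and purely illustrative; they are not taken from the study, and the study's exact metric implementation is not stated in the abstract.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (requires nonzero y_true)."""
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)

# Made-up example values (Nm), NOT measurements from the study.
measured = [300.0, 320.0, 310.0]
predicted = [303.0, 317.0, 310.0]

print(f"R2   = {r2_score(measured, predicted):.4f}")
print(f"RMSE = {rmse(measured, predicted):.4f} Nm")
print(f"MAPE = {mape(measured, predicted):.4f} %")
```

A higher R² and lower RMSE and MAPE indicate closer agreement between predictions and measurements. MAPE is scale-free, which makes it convenient for comparing parameters with different units, such as torque and emission concentrations.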