A significant shortcoming of homogeneous charge compression ignition (HCCI) engines is their narrow operating envelope, particularly with high-reactivity fuels such as biodiesel. The compositional variability of biodiesels, reflected in cetane numbers that vary with the source feedstock, poses a further challenge. While various strategies have successfully extended the maximum load limit of diesel-fueled HCCI engines, they have not been adequately explored for biodiesel-fueled HCCI engines. This study is the first to comprehensively investigate the interplay of biodiesel composition, cetane number, engine compression ratio, and charge dilution strategy in extending the operating envelope of a light-duty HCCI engine. Because varying all of these parameters experimentally is impractical, this work examines the potential of machine learning (ML) algorithms to predict the operating limits of an HCCI engine fueled with neat biodiesels derived from diverse sources. Among the ML models explored, artificial neural networks were the most accurate in predicting the minimum stable load, with prediction errors of 3.5% (calibration) and 3.9% (validation). Support vector machine models predicted the maximum load (with and without dilution) with errors below 5%. Notably, biodiesel produced from a blend of linseed, karanja, coconut, and mustard oils (42%, 3%, 25%, and 30% by mass, respectively) yielded a wide HCCI operating range, from 0.4 to 3.25 bar brake mean effective pressure (BMEP) without dilution and up to 3.67 bar BMEP with charge dilution. The study demonstrates the practicality of using ML to predict and extend the operating envelope of biodiesel-fueled HCCI engines, supporting the suitability of such engines for future applications.