Objective
This study aimed to develop and validate machine learning models for predicting proliferative lupus nephritis (PLN), offering a reliable diagnostic alternative when renal biopsy is not feasible or safe.

Methods
We retrospectively analyzed clinical and laboratory data from patients with systemic lupus erythematosus (SLE) and renal involvement who underwent renal biopsy at West China Hospital of Sichuan University between 2011 and 2021. Patients were randomly assigned to a training cohort (70%) or a test cohort (30%). Several machine learning models were built on the training cohort, including generalized linear models (e.g., logistic regression, least absolute shrinkage and selection operator [LASSO], ridge regression, and elastic net), support vector machines (linear and radial basis function kernels), and tree-based models (e.g., classical decision tree, conditional inference tree, and random forest). Diagnostic performance in both cohorts was evaluated using receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA). The models were also compared to identify key features and features shared across models, with the aim of screening for potential diagnostic markers of PLN.

Results
Of 1312 patients with lupus nephritis (LN), 780 with PLN or non-proliferative lupus nephritis (NPLN) were included in the analysis. They were randomly divided into a training cohort (547 cases) and a test cohort (233 cases). We developed nine machine learning models in the training cohort. Seven models demonstrated excellent discriminatory ability in the test cohort, and the random forest model showed the highest (area under the ROC curve [AUC]: 0.880, 95% confidence interval [CI]: 0.835–0.926). Logistic regression had the best calibration, while random forest yielded the greatest clinical net benefit. By comparing features across models, we confirmed the value of traditional indicators such as anti-dsDNA antibodies, complement levels, serum creatinine, and urinary red and white blood cells in predicting and distinguishing PLN. We also identified the potential value of previously controversial or underutilized indicators, including serum chloride, neutrophil percentage, serum cystatin C, hematocrit, urinary pH, red blood cell count on routine blood testing, and immunoglobulin M.

Conclusion
This study provides a comprehensive perspective on incorporating a broader range of biomarkers for diagnosing and predicting PLN. It also offers an ideal non-invasive diagnostic tool for SLE patients who are unable to undergo renal biopsy.
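The evaluation steps named in the Methods (a random 70/30 split, discrimination measured by AUC, and clinical net benefit from decision curve analysis) can be illustrated with a minimal sketch. The abstract does not state which software or exact procedures were used, so this is not the study's pipeline: the function names, the seed, and the toy labels and scores below are all hypothetical, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation.

```python
import random


def split_train_test(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of `items` and split it into training and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    This is the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Decision-curve net benefit at one threshold probability (0 < t < 1).

    net benefit = TP/n - (FP/n) * t / (1 - t), where cases with a
    predicted probability >= t are treated as test-positive.
    """
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical toy data: 1 = PLN, 0 = NPLN; scores are predicted probabilities.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

train_ids, test_ids = split_train_test(range(10))
print(len(train_ids), len(test_ids))             # 7 3
print(auc(toy_labels, toy_scores))               # 0.9375
print(net_benefit(toy_labels, toy_scores, 0.5))  # 0.25
```

A decision curve is obtained by evaluating `net_benefit` over a range of threshold probabilities and comparing the result with the "treat all" and "treat none" strategies. A plain shuffle as sketched here will not necessarily reproduce the reported 547/233 split.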