Knee osteoarthritis (OA) affects over 650 million patients worldwide. Total knee replacement is used in end-stage OA to relieve pain, stiffness and reduced mobility. However, the role of imaging modalities in monitoring symptomatic disease progression remains unclear. This study aimed to compare machine learning (ML) models, with and without imaging features, in predicting the two-year Western Ontario and McMaster Universities Arthritis Index (WOMAC) score for knee OA patients. We included 2408 patients from the Osteoarthritis Initiative (OAI) database and 629 patients from the Multicenter Osteoarthritis Study (MOST) database. The clinical dataset included 18 clinical features, while the imaging dataset contained an additional 10 imaging features. The Minimal Clinically Important Difference (MCID) was set to 24, reflecting meaningful physical impairment. Models trained on the clinical and imaging datasets produced similar area under the curve (AUC) scores, with small differences in performance (AUC difference < 0.025). For both clinical and imaging datasets, Gradient Boosting Machine (GBM) models performed best in the external validation, with clinically acceptable AUCs of 0.734 (95% CI 0.687–0.781) and 0.747 (95% CI 0.701–0.792), respectively. The five features identified were educational background, family history of osteoarthritis, co-morbidities, use of osteoporosis medications and previous knee procedures. This is the first study to demonstrate that ML models achieve comparable performance with and without imaging features.
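To make the reported metric concrete, the following minimal sketch shows one way an AUC and a 95% confidence interval could be computed and compared for two sets of model scores. The abstract does not state how the intervals were derived, so the percentile bootstrap used here is an assumption, not the study's method. The function names (`auc`, `bootstrap_auc_ci`) are hypothetical, and all labels and scores are synthetic stand-ins rather than OAI or MOST data.

```python
import random


def auc(labels, scores):
    """AUC via the Mann-Whitney statistic: the fraction of (positive, negative)
    pairs in which the positive case scores higher; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Synthetic example only: 120 hypothetical patients and two hypothetical
# score vectors standing in for a clinical-only and a clinical-plus-imaging model.
rng = random.Random(42)
labels = [1 if rng.random() < 0.4 else 0 for _ in range(120)]
scores_clinical = [rng.gauss(0.9 * y, 1.0) for y in labels]
scores_imaging = [s + rng.gauss(0.1 * y, 0.3) for s, y in zip(scores_clinical, labels)]

for name, scores in [("clinical", scores_clinical), ("imaging", scores_imaging)]:
    point = auc(labels, scores)
    lower, upper = bootstrap_auc_ci(labels, scores)
    print(f"{name}: AUC {point:.3f} (95% CI {lower:.3f}-{upper:.3f})")

print(f"AUC difference: {auc(labels, scores_imaging) - auc(labels, scores_clinical):.3f}")
```

The pairwise AUC computation is quadratic in the sample size, which is acceptable for a small illustration; for cohorts of the size described above, a rank-based implementation from an established statistics library would be the more practical choice.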