“…Thus, before a medical predictive model can be safely applied in clinical practice, it is crucial to test it not on a single dataset but on multiple datasets that are independent both of each other and of the data used to develop the algorithm. As a matter of fact, the majority of medical device filings to regulatory bodies such as the US Food and Drug Administration are based on multi-center clinical studies (Johnston, Dhruva et al 2020), and multi-center testing appears to have become increasingly common in the recent literature on ML for medical applications (Abraham, Milham et al 2017, Meyer, Mueller et al 2017, Gabr, Coronado et al 2019). However, previous studies using ML to predict clinical course and treatment response in OCD patients are mostly based on data collected at a single center (Salomoni, Grassi et al 2009, Hoexter, Miguel et al 2013, Yun, Jang et al 2015, Mas, Gasso et al 2016, Lenhard, Sauer et al 2018, Reggente, Moody et al 2018, Metin, Balli Altuglu et al 2020).…”