“…A major limitation of the studies presented in this review is that many approaches were either not tested with users (17 papers) or, when they were, only limited details of the testing were published, with no description of where the participants were recruited from, how many were recruited, or whether the participants were knowledgeable in Machine Learning (Pynadath et al, 2018; Tabrez and Hayes, 2019; Tabrez et al, 2019). Participant counts varied greatly: one paper used 3 experts (Wang et al, 2018), others recruited students (Iyer et al, 2018, n = 40; Greydanus et al, 2018, n = 31), and three recruited through Amazon Mechanical Turk (Huang et al, 2019, n = 191; Madumal et al, 2020, n = 120; and Ehsan et al, 2019, n = 65 and n = 60).…”