“…As a way out, one necessarily had to turn to benchmarking studies; seminal benchmarking attempts were reported by, for example, Baker (1974), Hubert (1974), and especially Milligan (Milligan, 1980, 1985; Milligan et al., 1983; Milligan & Cooper, 1985; for overviews of earlier benchmarking work in the area, see, e.g., Jain & Dubes, 1988; Milligan, 1981a, 1981b). This seminal work has since seen some follow-up (e.g., Anderlucci & Hennig, 2014; Arbelaitz et al., 2013; Costa et al., 2022; Hennig, 2022; Rossbroich et al., 2022; Schepers et al., 2006; Shireman et al., 2017; Steinley, 2003; Steinley & Brusco, 2008; Šulc & Řezanková, 2019; Wilderjans et al., 2013). Nevertheless, there is much less of a benchmarking tradition in clustering than in supervised learning.…”