In this work we conduct a comparative study of several publicly available, state-of-the-art hyper-heuristics for HyFlex in order to assess their generality across domains. To this end, we extend the HyFlex benchmark set with three new problem domains: the 0-1 Knapsack, Quadratic Assignment, and Max-Cut problems. To our knowledge, this is the first public extension of the benchmark since the CHeSC 2011 competition. It is also the first study to test the Fair-Share Iterated Local Search (FS-ILS) method, which was designed in prior research using a semi-automated design approach, on new, unseen problem domains. We show that, of the methods compared, Adap-HH (the CHeSC 2011 winner) clearly performs the most consistently overall. In addition, we identify a weakness of the FS-ILS method, as well as a way to simplify it further. Finally, we find that, overall, the state-of-the-art methods compared generalize much better than a naive baseline.