Clinical trials require substantial effort and time to complete, and regulatory agencies may require two successful efficacy trials before approving a new drug. One way to improve the chance of follow‐up success is to identify a subpopulation among whom treatment effects are estimated to be beneficial, and to enroll participants in future studies from this subpopulation. In this article we study confirmable responder class (CRC)
learning, where the objective is to learn, in a random half of the dataset (the training set), a subpopulation among whom the predicted conditional average treatment effect (CATE) suggests clinically meaningful benefit, with maximum power to be confirmed by a hypothesis test in the other half (the test set). We study a set of CRC learners across simulated datasets in which either all patients benefited or only some did. Performance metrics include the rate of confirmation in the test set and the classification accuracy of the CRC relative to the group with true CATE > 0. In simulated trials where all patients benefited, only two learners consistently identified most of the population as the CRC. One of these also performed especially well when only some patients benefited, achieving relatively high confirmation rates and showing robustness to the dimension of the covariate vector and to population characteristics. This learner is based on cross‐validation, and is a possible avenue for further development of model selection criteria for CRC learning. We also show that the performance of all methods can be improved by using both halves of the original dataset as training and test sets in turn.
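To make the split-and-confirm idea concrete, the following is a minimal sketch under stated assumptions, not one of the CRC learners evaluated in this article. It assumes a 1:1 randomized trial with a continuous outcome, estimates the CATE in the training half with separate least-squares fits per arm, defines the CRC as the set of patients whose predicted CATE exceeds a threshold, and then applies a one-sided Welch-type z-test in the test half. The function names (`simulate`, `crc_split`, `confirm`), the data-generating model, the threshold, and the significance level are all hypothetical choices made for illustration.

```python
import math
import numpy as np

def simulate(n=2000, p=3, seed=0):
    """Hypothetical trial: only patients with X[:, 0] > 0 truly benefit."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    a = rng.integers(0, 2, size=n)       # 1:1 randomized treatment indicator
    cate = X[:, 0]                       # true CATE (an assumption of this sketch)
    y = X[:, 1] + a * cate + rng.normal(size=n)
    return X, a, y, cate

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    Xd = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return coef

def predict_cate(X, coef_treated, coef_control):
    """Predicted CATE as the difference of the two arm-specific fits."""
    Xd = np.column_stack([np.ones(len(X)), X])
    return Xd @ coef_treated - Xd @ coef_control

def confirm(y, a, alpha=0.025):
    """One-sided z-test of H0: average treatment effect <= 0 in a subgroup."""
    y1, y0 = y[a == 1], y[a == 0]
    if len(y1) < 2 or len(y0) < 2:
        return False, float("nan")       # subgroup too small to test
    diff = y1.mean() - y0.mean()
    se = math.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
    p_value = 0.5 * math.erfc((diff / se) / math.sqrt(2))  # upper normal tail
    return p_value < alpha, p_value

def crc_split(X, a, y, threshold=0.0, seed=0):
    """Learn a CRC in a random half, then test it in the other half."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    train, test = idx[: len(y) // 2], idx[len(y) // 2:]
    treated, control = train[a[train] == 1], train[a[train] == 0]
    coef_t = fit_linear(X[treated], y[treated])
    coef_c = fit_linear(X[control], y[control])
    in_crc = predict_cate(X[test], coef_t, coef_c) > threshold
    confirmed, p_value = confirm(y[test][in_crc], a[test][in_crc])
    return in_crc, confirmed, p_value, test

X, a, y, cate = simulate()
in_crc, confirmed, p_value, test_idx = crc_split(X, a, y)
accuracy = np.mean(in_crc == (cate[test_idx] > 0))
print(f"confirmed={confirmed}, p={p_value:.3g}, accuracy={accuracy:.2f}")
```

The two quantities printed at the end correspond loosely to the performance metrics described above: whether the learned class is confirmed in the test set, and how well it agrees with the group whose true CATE is positive. Swapping the roles of the two halves would give a second learned class and a second test; how the two results should be combined is not addressed by this sketch.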