To date, hundreds of researchers have employed the method of Qualitative Comparative Analysis (QCA) for the purpose of causal inference. In a recent series of simulation studies, however, several authors have questioned the correctness of QCA in this connection. Some prominent representatives of the method have replied in turn that simulations with artificial data are unsuited for assessing QCA. We take issue with both positions in this impasse. On the one hand, we argue that data-driven evaluations of the correctness of a procedure of causal inference require artificial data. On the other hand, we prove all previous attempts in this direction to have been defective. For the first time in the literature on configurational comparative methods, we lay out a set of formal criteria for an adequate evaluation of QCA before implementing a battery of inverse-search trials to test how this method performs in different recovery contexts according to these criteria. While our results indicate that QCA is correct when generating the parsimonious solution type, they also demonstrate that the method is incorrect when generating the conservative and intermediate solution types. In consequence, researchers using QCA for causal inference, particularly in human-sensitive areas such as public health and medicine, should immediately discontinue employing the method’s conservative and intermediate search strategies.