Goldammer et al. (2020) examined the performance of careless response detection indices by experimentally manipulating survey instructions to induce careless responding and then comparing the ability of various indices to detect the induced careless responses. Based on these analyses, Goldammer et al. concluded that metrics designed to detect overly consistent response patterns (i.e., longstring and IRV) were ineffective. In this comment, we critique this conclusion by highlighting critical problems with the experimental manipulation used. Specifically, Goldammer et al.’s manipulations encouraged only overly inconsistent, or random, responding and thus did not induce the full range of careless response behavior seen in naturally occurring careless responding. As such, it is unsurprising that metrics designed to detect overly consistent responding did not appear to be effective. Because the full range of careless behavior was not induced, Goldammer et al.’s study cannot address the utility of longstring or similar metrics. We offer recommendations for alternative experimental manipulations that may produce more naturalistic and diverse careless responding.