Commentators interested in the societal implications of automated decision-making often overlook how decisions are made in the technology’s absence. For example, the benefits of ML and big data are often summarized as efficiency, objectivity, and consistency; the risks, meanwhile, include replicating historical discrimination and oversimplifying nuanced situations. While this perspective holds when technology replaces capricious human judgements, it is ill-suited to contexts where standardized assessments already exist. In spaces like employment selection, the relevant question is how an ML model compares to a manually built test. In this paper, we explain that since the Civil Rights Act, industrial and organizational (I/O) psychologists have struggled to produce assessments without disparate impact. We then explain how ML’s capacity for exploratory analysis, coupled with the back-testing capability offered by advances in data science, makes modern technology useful for hiring. Next, we empirically investigate a commercial hiring platform that applies several oft-cited benefits of ML to build custom job models for corporate employers. We focus on the disparate impact observed when models are deployed to evaluate real-world job candidates. Across a sample of 60 job models built for 26 employers and used to evaluate approximately 400,000 candidates, minority-weighted impact ratios of 0.93 (Black–White), 0.97 (Hispanic–White), and 0.98 (Female–Male) are observed. We find similar results when comparing candidates who select disability-related accommodations within the platform with unaccommodated users. We conclude by describing limitations, anticipating criticisms, and outlining further research.
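
For readers unfamiliar with the metric, the following is a minimal sketch of how an impact ratio and a minority-weighted average across jobs might be computed. It assumes the conventional definition of the impact ratio (the focal group's selection rate divided by the reference group's selection rate) and assumes that "minority-weighted" means each job's ratio is weighted by its number of focal-group candidates; the weighting scheme is our assumption for illustration, not a specification of the platform's method. The names `JobOutcome`, `impact_ratio`, and `weighted_impact_ratio` are hypothetical, and the counts are invented for the example.

```python
from dataclasses import dataclass

# Conventional "four-fifths" benchmark from the EEOC Uniform Guidelines:
# a ratio below 0.8 is generally treated as evidence of adverse impact.
FOUR_FIFTHS = 0.8


@dataclass
class JobOutcome:
    """Selection counts for one job model (hypothetical structure)."""
    focal_selected: int       # focal-group candidates who passed
    focal_total: int          # focal-group candidates evaluated
    reference_selected: int   # reference-group candidates who passed
    reference_total: int      # reference-group candidates evaluated


def impact_ratio(job: JobOutcome) -> float:
    """Focal-group selection rate divided by reference-group selection rate."""
    focal_rate = job.focal_selected / job.focal_total
    reference_rate = job.reference_selected / job.reference_total
    return focal_rate / reference_rate


def weighted_impact_ratio(jobs: list[JobOutcome]) -> float:
    """Average of per-job ratios, weighted by focal-group candidate count.

    Assumption: this is one plausible reading of "minority-weighted".
    """
    total_weight = sum(job.focal_total for job in jobs)
    weighted_sum = sum(impact_ratio(job) * job.focal_total for job in jobs)
    return weighted_sum / total_weight


# Invented counts for illustration only; not data from the study.
example_jobs = [
    JobOutcome(focal_selected=45, focal_total=100,
               reference_selected=200, reference_total=400),
    JobOutcome(focal_selected=150, focal_total=300,
               reference_selected=250, reference_total=500),
]

overall = weighted_impact_ratio(example_jobs)
passes_four_fifths = overall >= FOUR_FIFTHS
```

Weighting by focal-group size keeps jobs with very few minority candidates, whose ratios are statistically noisy, from dominating the summary figure.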