Healthcare organizations and workers are under pressure to produce increasingly complete and accurate data for multiple data-intensive endeavors. However, little research has examined the emerging occupations arising to carry out the data work necessary to produce “improved” datasets, or the specific work activities of these emerging data occupations. We describe the work of Clinical Documentation Integrity Specialists (CDIS), an emerging occupation that focuses on improving clinical documentation to produce more detailed and accurate administrative datasets crucial for evolving data-intensive forms of healthcare accountability, management, and research. Using ethnographic methods, we describe the core of CDIS’ work as a translation practice in which the language, interests, and concerns of clinicians and clinical documentation are translated, via real-time “nudging” and ongoing education of clinicians, into the language, interests, and concerns of medical coders, structured administrative datasets, and the various stakeholders of these datasets. Further, we show how the institutional context of CDIS’ work shapes the occupational virtues that guide CDIS’ translation practice, including financial reimbursement, quality measures, clinical accuracy, and protecting clinicians’ time. Despite the existence of these multiple virtues, financial reimbursement is the most prominent virtue guiding CDIS’ limited attention. Thus, overall, clinical documentation is “improved” in specific, partial ways. This research provides one of the first studies of the emergent data work occupations arising in the wake of digitization and big data opportunities, and shows how local data settings shape large-scale data in specific ways and thus may influence the outcomes of analyses based on such data.