In order to construct and evaluate the validity argument for the Objective Structured Assessment of Technical Skills (OSATS), based on Kane's framework, we conducted a systematic review. We searched MEDLINE, EMBASE, CINAHL, PsycINFO, ERIC, Web of Science, Scopus, and selected reference lists through February 2013. Working in duplicate, we selected original research articles in any language evaluating the OSATS as an assessment tool for any health professional. We iteratively and collaboratively extracted validity evidence from included articles to construct and evaluate the validity argument for varied uses of the OSATS. Twenty-nine articles met the inclusion criteria, all focussed on surgical technical skills assessment. We identified three intended uses for the OSATS, namely formative feedback, high-stakes assessment and program evaluation. Following Kane's framework, four inferences in the validity argument were examined (scoring, generalization, extrapolation, decision). For formative feedback and high-stakes assessment, there was reasonable evidence for scoring and extrapolation. However, for high-stakes assessment there was a dearth of evidence for generalization aside from inter-rater reliability data and an absence of evidence linking multi-station OSATS scores to performance in real clinical settings. For program evaluation, the OSATS validity argument was supported by reasonable generalization and extrapolation evidence. There was a complete lack of evidence regarding implications and decisions based on OSATS scores. In general, validity evidence supported the use of the OSATS for formative feedback. Research to provide support for decisions based on OSATS scores is required if the OSATS is to be used for higher-stakes decisions and program evaluation.