Engineers seek to design systems that will produce an intended change in the state of the world. How are we to know whether a system will behave as intended? This article addresses ways that this question can be answered. Specifically, we focus on three types of research validity: (1) internal validity, or whether an observed association between two variables can be attributed to a causal link between them; (2) external validity, or whether a causal link generalizes across contexts; and (3) construct validity, or whether a specific set of metrics corresponds to what it is intended to measure. In each case, we discuss techniques that may be used to establish the corresponding type of validity: namely, quasi‐experimental design, replication, and establishment of convergent‐discriminant validity and reliability. These techniques typically require access to data, which has historically been limited for research on complex engineered systems. This limitation is likely to ease in the era of “big data.” We therefore discuss the continued utility of these validity concepts in the face of advances in machine learning and big data as they pertain to complex engineered sociotechnical systems. Next, we discuss relationships between these validity concepts and other prominent approaches to evaluating research in the field. Finally, we propose a set of criteria by which one may evaluate research that uses quantitative observation to test causal theory in the field of complex engineered systems.