When people are engaged in social interaction, they often repeat each other’s communicative behavior, such as words or gestures. This kind of alignment has been studied across a wide range of disciplines and has been accounted for by diverging theories. In this paper, we review various operationalizations of lexical and gestural alignment. We reveal that scholars have fundamentally different takes on when and how behavior is considered to be aligned, which makes it difficult to compare findings and draw uniform conclusions. Furthermore, we show that scholars tend to focus on one particular dimension of alignment (traditionally, whether two instances of behavior overlap in form), yet underspecify, conflate or neglect other dimensions. This stands in the way of proper theory testing and building, which requires a well-defined account of the factors that are central to alignment or might enhance it. To capture the complex nature of alignment, we identify five key dimensions to formalize the relationship between any pair of behaviors: sequence, time, semantics, form and modality. We show how assumptions regarding the underlying mechanism of alignment (categorized into priming versus grounding) pattern together with the operationalization in terms of the five dimensions. This conceptual framework can help researchers in the field of alignment and related phenomena (including behavior matching, mimicry, entrainment and accommodation) to formulate their hypotheses and operationalizations in a more transparent and systematic manner. The framework also enables us to discover unexplored research avenues and derive new hypotheses from existing theories.