Vibrational measurements play an important role for structural health monitoring, e.g., modal extraction and damage diagnosis. Moreover, conditions of civil structures can be mostly assessed by displacement responses. However, installing displacement transducers between the ground and floors in real-world buildings is unrealistic due to lack of reference points and structural scales and complexity. Alternatively, structural displacements can be acquired using computer vision-based motion extraction techniques. These extracted motions not only provide vibrational responses but are also useful for identifying the modal properties. In this study, three methods, including the optical flow with the Lucas–Kanade method, the digital image correlation (DIC) with bilinear interpolation, and the in-plane phase-based motion magnification using the Riesz pyramid, are introduced and experimentally verified using a four-story steel-frame building with a commercially available camera. First, the three displacement acquiring methods are introduced in detail. Next, the displacements are experimentally obtained from these methods and compared to those sensed from linear variable displacement transducers. Moreover, these displacement responses are converted into modal properties by system identification. As seen in the experimental results, the DIC method has the lowest average root mean squared error (RMSE) of 1.2371 mm among these three methods. Although the phase-based motion magnification method has a larger RMSE of 1.4132 mm due to variations in edge detection, this method is capable of providing full-field mode shapes over the building.