Urban vibrancy is characterized by the activities of residents and their spatio-temporal dynamics. The metro station area (MSA) is one of the densest and most populous areas of the city, so creating a vibrant and diverse urban environment there is an important goal of transit-oriented development (TOD). Existing studies indicate that the built environment decisively determines MSA-level urban vibrancy. However, the spatio-temporal heterogeneity of such effects requires thorough exploration and justification. In this study, we first apply mobile signaling data to quantify and decipher the spatio-temporal distribution characteristics of MSA-level urban vibrancy in Chengdu, China. Then, we measure the built environment of the MSAs by using multi-source big data. Finally, we employ geographically and temporally weighted regression (GTWR) models to examine the spatio-temporal non-stationarity of the impact of the MSA-level built environment on urban vibrancy. The results show that: 1) High-vibrancy MSAs are concentrated in the commercial center and the employment center. 2) Indicators such as residential density, overpasses, road density, road network integration index, enterprise density, and restaurant density are significantly and positively associated with urban vibrancy, while indicators such as housing price and bus stop density are negatively associated with urban vibrancy. 3) The GTWR model fits the data better than the stepwise regression model. The impact of the MSA-level built environment on urban vibrancy shows strong non-stationarity in both the spatial and temporal dimensions, which matches the spatio-temporal dynamic patterns of residents’ daily work, leisure, and consumption activities. The findings can provide references for planners and city managers on how to create vibrant TOD communities.
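To illustrate the general idea behind GTWR, the following is a minimal sketch and not the implementation used in this study. It fits one weighted least-squares regression per observation, down-weighting other observations by their combined distance in space and time. The Gaussian kernel, the fixed `bandwidth`, the scale factors `lam` and `mu`, and the function name `gtwr_fit` are all assumptions made for illustration.

```python
import numpy as np

def gtwr_fit(coords, times, X, y, bandwidth, lam=1.0, mu=1.0):
    """Return local coefficients (intercept first), one row per observation.

    coords: (n, 2) spatial coordinates; times: (n,) time stamps;
    X: (n, k) predictors; y: (n,) response.
    lam and mu (hypothetical names) balance spatial and temporal distance.
    """
    n = X.shape[0]
    Xd = np.column_stack([np.ones(n), X])  # design matrix with intercept
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        # Squared spatio-temporal distance from observation i to all others.
        ds2 = np.sum((coords - coords[i]) ** 2, axis=1)
        dt2 = (times - times[i]) ** 2
        d2 = lam * ds2 + mu * dt2
        # Gaussian kernel weights (assumed kernel choice).
        w = np.exp(-d2 / bandwidth ** 2)
        # Weighted least squares: solve (X'WX) beta = X'Wy.
        XtW = Xd.T * w
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    return betas
```

Each row of the returned array holds the coefficients estimated at one observation's location and time. Variation across rows is what a GTWR analysis reads as spatio-temporal non-stationarity. In practice the bandwidth and scale factors would be chosen by cross-validation or an information criterion rather than fixed by hand.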