Traffic forecasting is a challenging task that requires spatial-temporal modeling and attracts research interest from various domains. In recent years, spatial-temporal deep learning models have improved the accuracy and scale of traffic forecasting. Although hundreds of models have been proposed, they share similar modules, or building blocks, which fall into three temporal feature extraction methods (recurrent neural networks, convolution, and self-attention) and two spatial feature extraction methods (convolutional graph neural networks (GNNs) and attentional GNNs). More importantly, the models have mostly been evaluated as entire architectures, with limited effort to characterize and understand the performance of each category of building block. In this study, we conduct an extensive, multifaceted experiment to understand the influence of building block selection on traffic forecasting accuracy, considering environmental characteristics and dataset distributions. Specifically, we implement six traffic forecasting models by combining the three temporal and two spatial building blocks. When we evaluate the models on four datasets with diverse characteristics, the results show that each building block exhibits distinguishable characteristics depending on study sites, prediction horizons, and traffic categories. The convolution models demonstrate higher overall forecasting performance than the other models, whereas the self-attention models are competitive in less frequent traffic categories, in transition states, and in the presence of outliers. Based on these results, we also propose an adaptive model evaluation framework for category-wise predictions on test sets, based on the performance of the models on validation sets. This evaluation framework improves forecasting accuracy by up to 3.7% without further sophistication of existing model architectures.
The results enhance the utility of existing models and suggest guidelines for researchers building traffic forecasting model architectures and for practitioners implementing these state-of-the-art techniques in real-world applications.
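
The idea behind the adaptive evaluation framework can be illustrated with a minimal sketch. This is not the implementation used in the study; it assumes that every sample already carries an integer traffic-category label, that mean absolute error (MAE) is the selection metric, and that each model's predictions are available as NumPy arrays. The function names, the model names `conv` and `attn`, and the toy numbers are all hypothetical.

```python
import numpy as np

def select_models_by_category(val_true, val_preds, val_cats):
    """Map each category to the model with the lowest validation MAE.

    val_preds is a dict {model name: prediction array}; the MAE metric
    and the integer category labels are assumptions of this sketch.
    """
    best = {}
    for cat in np.unique(val_cats):
        mask = val_cats == cat
        maes = {
            name: float(np.mean(np.abs(pred[mask] - val_true[mask])))
            for name, pred in val_preds.items()
        }
        best[int(cat)] = min(maes, key=maes.get)
    return best

def adaptive_predict(test_preds, test_cats, best, default):
    """Take each test sample's prediction from its category's selected model.

    Samples whose category was never seen in validation keep the
    prediction of the `default` model.
    """
    out = np.array(test_preds[default], dtype=float)  # copy of the default output
    for cat, name in best.items():
        mask = test_cats == cat
        out[mask] = test_preds[name][mask]
    return out

# Toy validation data (hypothetical): two categories, two models.
val_true = np.array([10.0, 20.0, 60.0, 70.0])
val_cats = np.array([0, 0, 1, 1])
val_preds = {
    "conv": np.array([10.0, 21.0, 50.0, 80.0]),
    "attn": np.array([14.0, 25.0, 61.0, 69.0]),
}
best = select_models_by_category(val_true, val_preds, val_cats)

# Toy test data (hypothetical): one sample per category.
test_cats = np.array([0, 1])
test_preds = {
    "conv": np.array([12.0, 55.0]),
    "attn": np.array([15.0, 65.0]),
}
combined = adaptive_predict(test_preds, test_cats, best, default="conv")
```

In this toy case the convolution model is selected for category 0 and the self-attention model for category 1, so the combined output draws on both models without changing either architecture. How test samples are assigned to categories in practice is left open here.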