Traffic prediction plays a significant role in Intelligent Transportation Systems (ITS). Although many datasets have been introduced to support the study of traffic prediction, most of them only provide time-series traffic data. However, urban transportation systems are always susceptible to various factors, including unusual weather and traffic accidents. Therefore, relying solely on historical data for traffic prediction greatly limits the accuracy of the prediction. In this paper, we introduce Beijing Text-Traffic (BjTT), a large-scale multimodal dataset for traffic prediction. BjTT comprises over 32,000 time-series traffic records, capturing velocity and congestion levels on more than 1,200 roads within the 5th ring area of Beijing. Meanwhile, each piece of traffic data is coupled with a text describing the traffic system (including time, location, and events). We detail the data collection and processing procedures and present a statistical analysis of the BjTT dataset. Furthermore, we conduct comprehensive experiments on the dataset with state-of-the-art traffic prediction methods and text-guided generative models, which reveal the unique characteristics of the BjTT. The dataset is available at https://github.com/ChyaZhang/BjTT.