Traffic data usually have to be aggregated over a fixed time interval before they are used for forecasting. If the aggregation size is too small, the data are unstable and the forecast becomes biased. If it is too large, information is lost, and the forecast tends toward the average or responds slowly to changes. With the development of intelligent transportation systems, and of urban traffic control systems in particular, traffic forecasts must be both timely and accurate. Determining the aggregation size is therefore an essential topic in traffic forecasting research. This paper considers the mutual information between the forecast input and the forecast result, together with the sequence complexity of the forecast result as measured by approximate entropy, sample entropy, and fuzzy entropy, and derives the optimal data aggregation size from them. To verify the proposed method, validated simulation data are aggregated at different aggregation sizes and then used for forecasting. A comparison of prediction performance across aggregation sizes shows that the optimal aggregation size reduces the MSE by 14–30%. The results indicate that the proposed method helps select the optimal data aggregation size for forecasting and can improve prediction performance.
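
As a rough illustration of two ingredients named above, the sketch below aggregates a raw count series at a given aggregation size and computes its sample entropy as a complexity measure. It is a minimal sketch under stated assumptions, not the procedure used in this paper: the function names `aggregate` and `sample_entropy` are hypothetical, aggregation is assumed to be a plain sum over consecutive samples, and the embedding dimension `m = 2` and tolerance `0.2 * std` are common default choices rather than values taken from this work.

```python
import numpy as np


def aggregate(counts, size):
    """Sum consecutive groups of `size` samples; an incomplete tail is dropped."""
    counts = np.asarray(counts, dtype=float)
    usable = (len(counts) // size) * size
    return counts[:usable].reshape(-1, size).sum(axis=1)


def sample_entropy(series, m=2, r_factor=0.2):
    """Sample entropy with Chebyshev distance (assumed defaults: m=2, r=0.2*std)."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    r = r_factor * x.std()

    def count_pairs(length):
        # Use the same n - m template start points for both template lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        # Pairwise Chebyshev (max-norm) distances between templates.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Count matching pairs i < j, excluding self-matches on the diagonal.
        return (np.sum(dist <= r) - len(templates)) / 2

    b = count_pairs(m)      # matches of length m
    a = count_pairs(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return float(-np.log(a / b))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw = rng.poisson(lam=5, size=1200)  # synthetic counts, for illustration only
    for size in (1, 5, 15):
        agg = aggregate(raw, size)
        print(size, len(agg), sample_entropy(agg))
```

A lower sample entropy indicates a more regular series. How such complexity values are combined with mutual information to choose the aggregation size is specific to the proposed method and is not reproduced here.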