Intelligent Transportation Systems (ITS) research and applications benefit from accurate short-term traffic state forecasting. To improve forecasting accuracy, this paper proposes a deep learning based multitask learning model built on Gated Recurrent Units with residual mappings (MTL-GRU). To further enhance the performance of the MTL-GRU, feature engineering is used to select the most informative features for forecasting. Numerical results on real-world datasets show that the MTL-GRU can accurately forecast traffic flow and speed simultaneously and outperforms its counterparts. Experiments also show that the deep learning based MTL-GRU model can overcome the performance bottleneck that arises as the training dataset is enlarged and continues to benefit from additional data. These results suggest that the proposed MTL-GRU model with residual mappings is a promising approach to short-term traffic state forecasting.

INDEX TERMS Short-term traffic forecasting, deep learning, multitask learning, feature engineering.