Forecasting is one of the key applications of machine learning. It becomes harder when the data-generating process has spatiotemporal dependencies. Predicting congestion ahead of time is an important part of transportation system management. Traffic congestion on a road network has a temporal component, due to daily and weekly variation in human travel, and a spatial component, due to the connected nature of the road network and of traffic flow. Furthermore, the spatial component is not Euclidean, because the road network is directional: it is a directed graph rather than an undirected one. Congestion prediction is a time series analysis problem that can be cast as neural-network-based sequence prediction. In this research we propose a Convolutional Long Short-Term Memory (CLSTM) model, which incorporates both spatial and temporal information into the forecasting process. To validate the proposed method, we compare its performance with that of other deep learning architectures, namely Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) networks, and with baseline methods such as Vector Autoregression (VAR) and the historical average. The experiments cover these architectures with varying numbers of units per layer, numbers of layers, optimizers, learning rates, and input sequence lengths. The prediction results are compared in tables and plots.
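
As an illustration of two ingredients named above, the following minimal sketch shows one way a historical-average baseline and fixed-length input sequences could be built for a single sensor. It is not the implementation used in this work: the function names `historical_average` and `sliding_windows`, the `period` and `horizon` parameters, and the toy readings are all assumptions made for the example.

```python
from collections import defaultdict


def historical_average(history, period):
    """Return a predictor based on seasonal means (illustrative sketch).

    history: observations for one sensor at regular intervals.
    period:  number of intervals in one seasonal cycle (assumed, e.g. a week).
    The prediction for time index t is the mean of all observations in
    `history` that fall in the same slot, t % period.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for t, value in enumerate(history):
        slot = t % period
        sums[slot] += value
        counts[slot] += 1
    overall_mean = sum(history) / len(history)

    def predict(t):
        slot = t % period
        if counts[slot] == 0:
            # No observation for this slot yet: fall back to the overall mean.
            return overall_mean
        return sums[slot] / counts[slot]

    return predict


def sliding_windows(series, input_len, horizon=1):
    """Split a series into (input sequence, target) pairs.

    input_len: length of each input sequence.
    horizon:   how many steps after the end of the input the target lies.
    """
    pairs = []
    for start in range(len(series) - input_len - horizon + 1):
        x = series[start:start + input_len]
        y = series[start + input_len + horizon - 1]
        pairs.append((x, y))
    return pairs


# Toy data (hypothetical readings, cycle length of 3 intervals).
readings = [10, 20, 30, 12, 22, 32]
predict = historical_average(readings, period=3)
next_value = predict(6)  # mean of the readings at indices 0 and 3
windows = sliding_windows([1, 2, 3, 4, 5], input_len=3, horizon=1)
```

Varying `input_len` in such a helper corresponds to varying the input sequence length in the experiments, and the seasonal-mean predictor gives a simple reference point against which sequence models can be compared.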