Traffic congestion is a perennial issue because of the increasing traffic demand yet limited budget for maintaining current transportation infrastructure; let alone expanding them. Many congestion management techniques require timely and accurate traffic estimation and prediction. Examples of such techniques include incident management, real-time routing, and providing accurate trip information based on historical data. In this dissertation, a speech-powered traffic prediction platform is proposed, which deploys a new deep learning algorithm for traffic prediction using Connected Vehicles (CV) data. To speed-up traffic forecasting, a Graph Convolution -- Gated Recurrent Unit (GC-GRU) architecture is proposed and analysis of its performance on tabular data is compared to state-of-the-art models. GC-GRU's Mean Absolute Percentage Error (MAPE) was very close to Transformer (3.16 vs 3.12) while achieving the fastest inference time and a six-fold faster training time than Transformer, although Long-Short-Term Memory (LSTM) was the fastest in training. Such improved performance in traffic prediction with a shorter inference time and competitive training time allows the proposed architecture to better cater to real-time applications. This is the first study to demonstrate the advantage of using multiscale approach by combining CV data with conventional sources such as Waze and probe data. CV data was better at detecting short duration, Jam and stand-still incidents and detected them earlier as compared to probe. CV data excelled at detecting minor incidents with a 90 percent detection rate versus 20 percent for probes and detecting them 3 minutes faster. To process the big CV data faster, a new algorithm is proposed to extract the spatial and temporal features from the CSV files into a Multiscale Data Analysis (MDA). The algorithm also leverages Graphics Processing Unit (GPU) using the Nvidia Rapids framework and Dask parallel cluster in Python. The results show a seventy-fold speedup in the data Extract, Transform, Load (ETL) of the CV data for the State of Missouri of an entire day for all the unique CV journeys (reducing the processing time from about 48 hours to 25 minutes). The processed data is then fed into a customized UNet model that learns highlevel traffic features from network-level images to predict large-scale, multi-route, speed and volume of CVs. The accuracy and robustness of the proposed model are evaluated by taking different road types, times of day and image snippets of the developed model and comparable benchmarks. To visually analyze the historical traffic data and the results of the prediction model, an interactive web application powered by speech queries is built to offer accurate and fast insights of traffic performance, and thus, allow for better positioning of traffic control strategies. The product of this dissertation can be seamlessly deployed by transportation authorities to understand and manage congestions in a timely manner.