Traffic state prediction models are a crucial element of intelligent transportation systems, with many applications. Short-term, network-wide modeling of traffic states is challenging because traffic state time series are inherently nonlinear, periodic, and stochastic. This challenge has been addressed by advanced machine learning algorithms such as deep learning. Deep neural networks can cope with high dimensionality and can extract nonlinearity, comovement patterns, and spatiotemporal interdependencies between traffic state variables at different locations. Nevertheless, they cannot fully capture the location-specific features of traffic information. We therefore propose the Discrete Haar Wavelet Transform (DHWT) as a preprocessing step before a Multilayer Perceptron (MLP) neural network for one-hour-ahead traffic state prediction. DHWT helps the MLP learn the network-wide comovement patterns from the trend component of the time series while efficiently capturing the significant characteristics of each individual detector from the noise component. Results on 20 sensors in Paris indicate that the hybrid DHWT-MLP model with a two-level decomposition improves the Mean Squared Error (MSE) of an MLP without preprocessing by 33.73% and 17.58% for the six-month and three-month data, respectively. However, the proposed model does not perform well on the one-month period compared with the MLP model. It may therefore be helpful to use lower wavelet decomposition levels (higher orders) when dealing with relatively small traffic datasets.
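
To make the decomposition step concrete, the following is a minimal sketch of a two-level discrete Haar transform on a single detector's series. It is an illustration under stated assumptions, not the authors' pipeline: the orthonormal (1/sqrt(2)) scaling, the requirement that the series length be divisible by 2^levels, the sample values, and the function names `haar_step`, `haar_decompose`, and `haar_reconstruct` are all hypothetical choices. Mapping the approximation coefficients to the "trend" component and the detail coefficients to the "noise" component is likewise an assumption about the terminology above.

```python
import numpy as np


def haar_step(x):
    """One level of the orthonormal Haar transform on an even-length series."""
    x = np.asarray(x, dtype=float)
    if x.size % 2 != 0:
        raise ValueError("series length must be even")
    pairs = x.reshape(-1, 2)
    # Scaled pairwise sums give the smooth (approximation) coefficients.
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
    # Scaled pairwise differences give the detail coefficients.
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
    return approx, detail


def haar_decompose(x, levels=2):
    """Apply haar_step repeatedly to the approximation coefficients.

    Returns the coarsest approximation and a list of detail arrays,
    ordered from the finest level (index 0) to the coarsest.
    """
    approx = np.asarray(x, dtype=float)
    if approx.size % (2 ** levels) != 0:
        raise ValueError("series length must be divisible by 2**levels")
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose, rebuilding the original series."""
    x = np.asarray(approx, dtype=float)
    for detail in reversed(details):
        pairs = np.empty((x.size, 2))
        pairs[:, 0] = (x + detail) / np.sqrt(2.0)
        pairs[:, 1] = (x - detail) / np.sqrt(2.0)
        x = pairs.reshape(-1)
    return x


# Hypothetical readings from one detector (illustrative values only).
signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
trend, details = haar_decompose(signal, levels=2)
restored = haar_reconstruct(trend, details)
```

Because the transform is orthonormal, `restored` matches `signal` up to floating-point error, so no information is lost by the preprocessing itself. How the trend and detail coefficients are then arranged as MLP inputs is not specified here and would need to follow the full method description.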