A smart city improves operational efficiency and living comfort by harnessing technologies such as the Internet of Things (IoT) to collect and process data for decision making. To better support smart cities, the data collected by IoT devices must be stored and processed appropriately. However, IoT devices are often task-specialized and resource-constrained, so they rely heavily on online computing and storage resources to accomplish their tasks. Such cloud-based solutions typically centralize resources far from the end IoT devices and cannot respond to users in time when massive numbers of offloaded tasks congest the core network. By decentralizing resources and placing them spatially close to IoT devices, mobile edge computing (MEC) can fulfill service requests in proximity, reducing latency and improving service quality for a smart city. Because service demands exhibit spatio-temporal features, deploying MEC servers at optimal locations and allocating MEC resources appropriately play an essential role in meeting service requirements efficiently. It is therefore essential to learn how resource demand is distributed over time and space. In this work, we first propose a spatio-temporal Bayesian hierarchical learning approach that learns and predicts the distribution of MEC resource demand over space and time to facilitate MEC deployment and resource management. Second, we train and test the proposed model on real-world data, and the results demonstrate that the method achieves high accuracy. Third, we demonstrate an application of the proposed method by simulating task offloading. Finally, the simulation results show that, in unobserved areas, resources allocated according to our model's predictions are used more efficiently than resources divided equally among all servers.