The world is rapidly urbanizing, with 66% of the world's population expected to reside in cities by 2050. This massive influx of new urban citizens is putting enormous pressure on city systems and bringing forth challenges at the intersection of urban infrastructure, governance and the environment. As a result, researchers and practitioners have turned to new advanced sensing and data analytics developed under the burgeoning "smart city" movement to improve the design, management and operations of urban systems. However, data emerging from urban systems has been challenging to integrate, organize and analyze due to its natural spatial, temporal and typological heterogeneity. In this paper, an Urban Data Integration (UDI) framework is introduced that is capable of integrating heterogeneous urban data. The proposed UDI framework is extensible to multiple types of urban systems, scalable to the growing amount of quickly changing urban data streams and interpretable enough to help inform municipal decision-making. The UDI framework utilizes a series of novel proximity relationship learning algorithms to automatically reconstruct urban data in a graph database. The merits, applicability and efficacy of the proposed framework are demonstrated by validating and testing it on data from a mid-size city in the United States and by benchmarking its interpretability and computational performance for a typical urban analytics scenario against current practice (i.e., a relational database). Results indicate that the UDI framework provides easier and more computationally efficient exploration and querying of urban data and in turn can enable new computational approaches to urban system design, management and operations.
Keywords: data integration, graph database, proximity learning, smart city, urban data

The heterogeneity of urban data and the nascent field of urban analytics both point towards the need for integrating urban data early and often. For example, if new data becomes available on the air quality along a main traffic corridor, a municipal official may want to map this new information to existing data sources on related systems (e.g., traffic, roads), re-run analytical queries to understand mutual influences and then take appropriate actions. Maintaining a high level of interpretability is vital during the integration process, as the goal is to support urban design and operational decisions by municipal officials, policy-makers and engineers. As a result, a useful urban data integration framework must be extensible to multiple urban systems (and not system specific), scalable to the growing amounts of quickly changing urban data streams and interpretable such that it can inform decision-making.

In this paper, an Urban Data Integration (UDI) framework is introduced that integrates heterogeneous urban data while maintaining extensibility, scalability and interpretability. The proposed UDI framework utilizes a series of novel proxi...