Abstract: Semantic models of data sources describe the meaning of the data in terms of the concepts and relationships defined by a domain ontology. Building such models is an important step toward integrating data from different sources, where we need to provide the user with a unified view of the underlying sources. In this paper, we present a scalable approach to automatically learn semantic models of a structured data source by exploiting the knowledge of previously modeled sources. Our evaluation shows that the approach generates expressive semantic models with minimal user input, and that it scales to large ontologies and to data sources with many attributes.

I. INTRODUCTION

A significant amount of the information available on the Web resides in sources such as relational databases, spreadsheets, XML, JSON, and Web APIs. A common approach to integrating these sources involves building a domain model and constructing source descriptions that represent the intended meaning of the data by specifying mappings between the sources and the domain model [1]. In the Semantic Web, the domain model is an ontology that defines concepts and relationships within a domain, and source descriptions are formal specifications of semantic models. Semantic models can be viewed as graphs with ontology classes as the nodes and ontology properties as the links between the nodes.

Manually constructing semantic models is a time-consuming task that requires significant effort and expertise. Automatically generating these models involves two steps. The first step is specifying the semantic types, i.e., labeling each data field, or source attribute, with a class or a data property of the domain ontology. However, simply annotating the attributes is not sufficient. A precise semantic model needs a second step that specifies the relationships between the source attributes in terms of the properties in the ontology.
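To make the two modeling steps concrete, the following is a minimal sketch of a semantic model represented as a graph. It is an illustration only, not the representation used in this work: the source attributes (`name`, `city`), the ontology terms (`Person`, `City`, `livesIn`), and all class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SemanticType:
    """Step 1: label one source attribute with an ontology class and data property."""
    attribute: str        # column or field name in the source (hypothetical)
    ontology_class: str   # class node the attribute attaches to (hypothetical)
    data_property: str    # data property linking the class to the value (hypothetical)


@dataclass
class SemanticModel:
    """A graph with ontology classes as nodes and ontology properties as links."""
    semantic_types: list = field(default_factory=list)
    # Step 2: (source class, object property, target class) triples.
    relationships: list = field(default_factory=list)

    def class_nodes(self):
        """Return every class node mentioned by a semantic type or a relationship."""
        nodes = {t.ontology_class for t in self.semantic_types}
        for source, _, target in self.relationships:
            nodes.update((source, target))
        return nodes

    def is_connected(self):
        """Check whether the relationships link all class nodes (direction ignored)."""
        nodes = self.class_nodes()
        if not nodes:
            return True
        adjacency = {node: set() for node in nodes}
        for source, _, target in self.relationships:
            adjacency[source].add(target)
            adjacency[target].add(source)
        seen, stack = set(), [next(iter(nodes))]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(adjacency[node] - seen)
        return seen == nodes


# Step 1 only: each attribute is annotated, but the classes are unrelated.
model = SemanticModel(semantic_types=[
    SemanticType("name", "Person", "name"),
    SemanticType("city", "City", "label"),
])
annotated_only = model.is_connected()

# Step 2: an object property states how the two classes relate.
model.relationships.append(("Person", "livesIn", "City"))
fully_modeled = model.is_connected()
```

In this sketch, the model built from semantic types alone leaves `Person` and `City` as disconnected nodes, which mirrors why annotating the attributes is not sufficient. Adding the `livesIn` link is what states how the two attributes relate.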
In Semantic Web research, there are many studies on mapping data sources to ontologies [2]-[7], but most focus on the first step of the modeling process or are very limited in automatically inferring the relationships.

In our previous work [8], we presented a novel approach to learn semantic models of data sources from known semantic models, that is, the semantic models of sources that have already been modeled. The work is inspired by the idea that different sources in the same domain often provide similar or overlapping data and have similar semantic models. Given sample data from the new source, we use an existing machine learning