Enterprise Search (ES) differs from traditional IR for a number of reasons. The most important are the high level of ambiguity of terms in queries and documents, and the existence of graph-structured enterprise data (ontologies) that describe the concepts of interest and their relationships to each other. Our method identifies concepts from the enterprise ontology in the query and corpus. We propose a ranking scheme for ontology sub-graphs on top of approximately matched token q-grams. The ranking leverages the graph structure of the ontology to incorporate concepts that are not explicitly mentioned. It improves on previous solutions by using a fine-grained ranking function that is specifically designed to cope with high levels of ambiguity. As a result, the method captures considerably more of the semantics of queries and documents than previous techniques. We support this claim with an evaluation of our method in three real-life scenarios from two different domains, in which it was consistently superior in terms of both precision and recall.
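The pipeline outlined in the abstract above (token q-grams, approximate matching against ontology concepts, and a ranking that also pulls in neighbouring concepts) can be illustrated with a minimal Python sketch. This is not the authors' ranking function, which the abstract does not specify. The toy `ONTOLOGY`, the character-trigram Dice similarity used for approximate matching, and the `threshold` and `damping` parameters are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy ontology (assumption): concept label -> related concept labels.
ONTOLOGY = {
    "gas turbine": ["compressor", "combustion chamber"],
    "compressor": ["gas turbine"],
    "combustion chamber": ["gas turbine"],
}


def char_ngrams(text, n=3):
    """Multiset of character n-grams of a lowercased, '#'-padded string."""
    padded = "#" * (n - 1) + text.lower() + "#" * (n - 1)
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def dice(a, b):
    """Dice similarity over character trigrams (assumed stand-in for approximate matching)."""
    ga, gb = char_ngrams(a), char_ngrams(b)
    overlap = sum((ga & gb).values())
    return 2 * overlap / (sum(ga.values()) + sum(gb.values()))


def token_qgrams(text, max_q=3):
    """All contiguous token sequences of length 1..max_q."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + q])
            for q in range(1, max_q + 1)
            for i in range(len(tokens) - q + 1)]


def rank_concepts(query, ontology=ONTOLOGY, threshold=0.7, damping=0.5):
    """Score concepts matched by some token q-gram, then add their graph neighbours."""
    scores = {}
    for gram in token_qgrams(query):
        for concept in ontology:
            similarity = dice(gram, concept)
            if similarity >= threshold:
                scores[concept] = max(scores.get(concept, 0.0), similarity)
    # Neighbours of matched concepts receive a damped score, so concepts that
    # are not explicitly mentioned in the query can still be ranked.
    for concept, score in list(scores.items()):
        for neighbour in ontology[concept]:
            scores[neighbour] = max(scores.get(neighbour, 0.0), damping * score)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranked = rank_concepts("failure of gas turbin blades")
```

In this sketch the misspelled query phrase "gas turbin" still matches the concept "gas turbine", and the two related concepts enter the ranking with a lower score even though the query never mentions them.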
This chapter argues that the notion of identity of, and reference to, entities (objects, individuals, instances) is fundamental to achieving semantic interoperability and integration between different sources of knowledge. The first step in integrating different information sources about an entity is to recognize that those sources describe the same entity. Unfortunately, different systems that manage information about entities commonly issue different identifiers for these entities. This makes reference to entities across information systems very complicated or impossible, because there is no means of knowing how an entity is identified in another system. The authors propose a global, public infrastructure, the Entity Name System (ENS), which enables the creation and re-use of identifiers for entities. This a priori approach enables systems to reference entities with a globally unique identifier and makes semantic integration much easier. The authors illustrate two enterprise use cases that build on this approach: entity-centric publishing and entity-centric corporate information management, currently being developed by two leading companies in their respective fields.
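The idea of creating and re-using identifiers can be made concrete with a small sketch. The class below is purely hypothetical and is not the ENS interface, which the abstract does not describe. It assumes that an entity is recognized by its normalized name and type, which is far simpler than real entity matching, and it mints `urn:uuid:` identifiers with Python's standard `uuid` module.

```python
import uuid


class EntityNameSystem:
    """Hypothetical in-memory registry that issues and re-uses entity identifiers."""

    def __init__(self):
        self._identifiers = {}

    def get_or_create(self, name, entity_type):
        # Assumption: two descriptions denote the same entity when their
        # whitespace- and case-normalized name and type agree.
        key = (" ".join(name.lower().split()), entity_type.lower())
        if key not in self._identifiers:
            self._identifiers[key] = f"urn:uuid:{uuid.uuid4()}"
        return self._identifiers[key]


ens = EntityNameSystem()
# Two independent systems ask for an identifier for the same entity.
crm_id = ens.get_or_create("Acme  Corporation", "Organization")
publishing_id = ens.get_or_create("acme corporation", "organization")
other_id = ens.get_or_create("Globex", "Organization")
```

Because both systems obtain the same identifier up front, their records about the entity can later be joined on that identifier without a separate reconciliation step.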