This paper introduces a hidden-topic-based framework for processing short, sparse documents (e.g., search result snippets, product descriptions, book/movie summaries, and advertising messages) on the Web. The framework addresses two main challenges posed by such documents: 1) data sparseness and 2) synonymy/homonymy. The former leads to a lack of shared words and contexts among documents, while the latter are major linguistic obstacles in natural language processing (NLP) and information retrieval (IR). The underlying idea of the framework is that common hidden topics discovered from large external data sets (universal data sets), when included, make short documents less sparse and more topic-oriented. Furthermore, hidden topics from universal data sets help the framework handle unseen data. The proposed framework can also be applied across different natural languages and data domains. We evaluated the framework in two experiments on important online applications (Web search result classification and matching/ranking for contextual advertising) with large-scale universal data sets, and obtained significant improvements.
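The enrichment idea above can be sketched in a few lines. The following is a minimal, hypothetical illustration only: it assumes a pre-trained topic model is available as a word-to-topic-weight table (a real system would estimate this from a large universal data set, e.g., with LDA), and all names and the toy table are invented for the example.

```python
# Toy stand-in for a topic model trained on a universal data set.
# word -> [weight for topic 0 ("sports"), weight for topic 1 ("finance")]
TOPIC_WORD = {
    "match":  [0.9, 0.1],
    "goal":   [0.8, 0.2],
    "stock":  [0.1, 0.9],
    "market": [0.2, 0.8],
}

def infer_topics(snippet, topic_word=TOPIC_WORD):
    """Infer a topic distribution by averaging the topic weights
    of the snippet's words that the model knows."""
    vecs = [topic_word[w] for w in snippet.lower().split() if w in topic_word]
    if not vecs:
        return None
    n_topics = len(vecs[0])
    return [sum(v[t] for v in vecs) / len(vecs) for t in range(n_topics)]

def enrich(snippet, threshold=0.5):
    """Append pseudo-tokens for dominant hidden topics, so that two
    snippets sharing no surface words can still share topic features."""
    dist = infer_topics(snippet)
    if dist is None:
        return snippet
    extra = [f"topic_{t}" for t, p in enumerate(dist) if p >= threshold]
    return snippet + " " + " ".join(extra)

print(enrich("stock market report"))   # gains the token "topic_1"
print(enrich("goal in the match"))     # gains the token "topic_0"
```

After enrichment, a standard classifier or matcher operates on the expanded token set, which is the sense in which hidden topics make sparse documents more topic-oriented.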
We develop the first bisimulation-based method of concept learning, called BBCL, for knowledge bases in description logics (DLs). Our method is formulated for a large class of useful DLs, including well-known DLs such as ALC, SHIQ, SHOIQ, and SROIQ. As bisimulation is the notion that characterizes indiscernibility of objects in DLs, our method is natural and very promising.
Description logics (DLs) are a suitable formalism for representing knowledge about domains in which objects are described not only by attributes but also by binary relations between objects. Fuzzy extensions of DLs can be used for such domains when data and knowledge about them are vague and imprecise. One of the possible ways to specify classes of objects in such domains is to use concepts in fuzzy DLs. As DLs are variants of modal logics, indiscernibility in DLs is characterized by bisimilarity. The bisimilarity relation of an interpretation is the largest auto-bisimulation of that interpretation. In DLs and their fuzzy extensions, such equivalence relations can be used for concept learning. In this paper, we define and study fuzzy bisimulation and bisimilarity for fuzzy DLs under the Gödel semantics, as well as crisp bisimulation and strong bisimilarity for such logics extended with involutive negation. The considered logics are fuzzy extensions of the DL ALC_reg (a variant of PDL) with additional features from among inverse roles, nominals, (qualified or unqualified) number restrictions, the universal role, local reflexivity of a role, and involutive negation. We formulate and prove results on invariance of concepts under fuzzy (resp. crisp) bisimulation, conditional invariance of fuzzy TBoxes/ABoxes under bisimilarity (resp. strong bisimilarity), and the Hennessy-Milner property of fuzzy (resp. crisp) bisimulation for fuzzy DLs without (resp. with) involutive negation under the Gödel semantics. Apart from these fundamental results, we also provide results on using fuzzy bisimulation to separate the expressive powers of fuzzy DLs, as well as results on using strong bisimilarity to minimize fuzzy interpretations. * This is a revised and corrected version of the publication "Bisimulation and bisimilarity for fuzzy description logics under the Gödel semantics", Fuzzy Sets and Systems 388: 146-178 (2020).
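To make the notion of fuzzy bisimulation under the Gödel semantics concrete, here is a toy fixed-point computation on a single fuzzy interpretation with one concept name A and one role r. The model, names, and iteration strategy are invented for illustration (they are not taken from the paper); the sketch starts from the constant-1 relation and tightens it until the atomic and forth/back conditions hold, using the Gödel implication.

```python
# Toy fuzzy interpretation: three states, one fuzzy concept name A,
# one fuzzy role R (pairs not listed have degree 0).
STATES = ["u", "v", "w"]
A = {"u": 1.0, "v": 1.0, "w": 0.4}
R = {("u", "v"): 0.8, ("u", "w"): 0.5}

def g_implies(a, b):
    """Goedel implication: 1 if a <= b, else b."""
    return 1.0 if a <= b else b

def g_iff(a, b):
    """Goedel bi-implication (used for the atomic condition)."""
    return min(g_implies(a, b), g_implies(b, a))

def fuzzy_bisimilarity(states=STATES, conc=A, role=R):
    """Greatest fixed point: repeatedly lower Z(x, y) to the largest
    degree consistent with the atomic and forth/back conditions."""
    r = lambda x, y: role.get((x, y), 0.0)
    Z = {(x, y): 1.0 for x in states for y in states}
    changed = True
    while changed:
        changed = False
        for x in states:
            for y in states:
                val = g_iff(conc[x], conc[y])          # atomic condition
                for xs in states:                      # forth condition
                    val = min(val, g_implies(
                        r(x, xs),
                        max(min(r(y, ys), Z[(xs, ys)]) for ys in states)))
                for ys in states:                      # back condition
                    val = min(val, g_implies(
                        r(y, ys),
                        max(min(r(x, xs), Z[(xs, ys)]) for xs in states)))
                if val < Z[(x, y)]:
                    Z[(x, y)] = val
                    changed = True
    return Z

Z = fuzzy_bisimilarity()
print(Z[("v", "w")])   # limited by the atomic condition: 0.4
print(Z[("u", "v")])   # u has successors, v has none: 0.0
```

On this toy model, v and w are bisimilar only to degree 0.4 (the degree to which A(v) and A(w) agree under Gödel bi-implication), while u and v are not bisimilar at all, since u has role successors and v has none.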
Abstract. We prove that any concept in any description logic that extends ALC with some features from among I (inverse roles), Q_k (qualified number restrictions with numbers bounded by a constant k), and Self (local reflexivity of a role) can be learnt if the training information system is good enough. That is, there exists a learning algorithm such that, for every concept C of those logics, there exists a training information system consistent with C such that applying the learning algorithm to that system results in a concept equivalent to C.