Natural language generation (NLG) refers to the process of producing text in a spoken language, starting from an internal knowledge representation structure. Augmentative and Alternative Communication (AAC) deals with the development of devices and tools to enable basic conversation for language-impaired people. We present an applied prototype of an AAC-NLG system generating written output in English and Hebrew from a sequence of Bliss symbols. The system does not "translate" the symbol sequence; instead, it dynamically changes the communication board as the choice of symbols proceeds, according to the syntactic and semantic content of the selected symbols, and generates utterances in natural language through a process of semantic authoring.
Computational linguistics methods are typically first developed and tested in English. When applied to other languages, assumptions from English data are often applied to the target language. One of the most common such assumptions is that a "standard" part-of-speech (POS) tagset can be used across languages with only slight variations. We discuss in this paper a specific issue related to the definition of a POS tagset for Modern Hebrew, as an example to clarify the method through which such variations can be defined. It is widely assumed that Hebrew has no syntactic category of modals. There is, however, an identified class of words which are modal-like in their semantics, and can be characterized through distinct syntactic and morphologic criteria. We have found wide disagreement among traditional dictionaries on the POS tag attributed to such words. We describe three main approaches when deciding how to tag such words in Hebrew. We illustrate the impact of selecting each of these approaches on agreement among human taggers, and on the accuracy of automatic POS taggers induced for each method. We finally recommend the use of a "modal" tag in Hebrew and provide detailed guidelines for this tag. Our overall conclusion is that tagset definition is a complex task which deserves appropriate methodology.
Information Retrieval (IR) research has recently started addressing the information need of exploratory search, where the searcher may be unfamiliar with the domain or may not yet have decided on the goal of the query. A popular tool to support exploratory search is faceted search. The implementation of faceted search requires that documents be annotated with metadata in the form of attributes and hierarchical categories. In many applications, the metadata is maintained manually, in the form of a search ontology. Recent work has also investigated methods to automatically acquire such metadata from sample documents [1,2]. In this work, we propose a new method to automatically evaluate the quality of such a search ontology. Our method relies on mapping ontology instances to textual documents. On the basis of this mapping, we evaluate the adequacy of ontology relations by measuring their classification potential over the textual documents. This data-driven method provides concrete feedback to ontology maintainers and a quantitative estimation of the functional adequacy of the ontology relations towards search experience improvement. We specifically evaluate whether an ontology relation can help the search engine support exploratory search in the form of effective facets. We test this ontology evaluation method on an ontology in the Movies domain, which has been acquired automatically from the integration of multiple semi-structured and textual data sources (e.g., IMDb and Wikipedia). We automatically construct a domain corpus from a set of movie instances by crawling the Web for movie reviews (both professional and user reviews). The 1-1 relation between textual documents (reviews) and movie instances in the ontology enables us to translate ontology relations into text classes.
We verify that the text classifiers induced by key ontology relations (genre, keywords, actors) achieve high performance and exploit the properties of the learned text classifiers to provide concrete feedback on the ontology. The proposed ontology evaluation method is general: it only relies on the possibility to automatically align textual documents to ontology instances.
Elhadad, Gabay and Netzer
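The evaluation idea in the abstract above (translate an ontology relation into text classes over the aligned reviews, then measure how well a classifier can recover those classes) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not say which classifier or score the authors used, so the sketch substitutes a small multinomial Naive Bayes model with leave-one-out accuracy, and the toy reviews, the `genre` relation values, the `noise` relation, and the function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Train a multinomial Naive Bayes model (counts only; smoothing is applied at prediction time)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, text):
    """Return the most probable class for a text, using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def relation_score(reviews, relation_values):
    """Leave-one-out accuracy when predicting a relation value from review text.

    Relies on the 1-1 alignment between reviews and ontology instances:
    both dicts are keyed by the same (hypothetical) movie ids.
    """
    ids = sorted(reviews)
    correct = 0
    for held_out in ids:
        train_ids = [i for i in ids if i != held_out]
        model = train_nb([reviews[i] for i in train_ids],
                         [relation_values[i] for i in train_ids])
        if predict_nb(model, reviews[held_out]) == relation_values[held_out]:
            correct += 1
    return correct / len(ids)

# Hypothetical toy corpus: one review per movie instance.
reviews = {
    "m1": "scary ghost haunted house terrifying",
    "m2": "terrifying ghost scary night",
    "m3": "haunted scary monster night",
    "m4": "funny jokes hilarious laugh",
    "m5": "hilarious funny laugh party",
    "m6": "jokes laugh funny wedding",
}
# A relation that the review text should reflect ...
genre = {"m1": "horror", "m2": "horror", "m3": "horror",
         "m4": "comedy", "m5": "comedy", "m6": "comedy"}
# ... and an arbitrary relation unrelated to the review text.
noise = {"m1": "A", "m2": "B", "m3": "A", "m4": "B", "m5": "A", "m6": "B"}

genre_score = relation_score(reviews, genre)
noise_score = relation_score(reviews, noise)
print("genre:", genre_score, "noise:", noise_score)
```

In this toy setting, a relation whose values are reflected in the review vocabulary scores higher than an arbitrary one, which is the kind of signal the abstract describes as feedback on whether a relation could serve as an effective facet. A real evaluation would need a far larger corpus and a properly validated classifier.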