The WissKI system provides a framework for ontology-based science communication and cultural heritage documentation. In many cases, the documentation consists of semi-structured data records with free text fields. Most references in the texts consist of person and place names, as well as time specifications. We present the WissKI tools for semantic annotation using controlled vocabularies and formal ontologies derived from the CIDOC Conceptual Reference Model (CRM). Current research deals with the annotations as building blocks for event recognition. Finally, we outline how the CRM helps to build bridges between documentation in different scientific disciplines.
In many cases, museum documentation consists of semi-structured data records with free text fields, which usually refer to the contents of other fields, in the same data record as well as in others. Most of these references consist of person and place names, as well as time specifications. It is therefore important to recognize these first. We report on techniques and results of partial parsing in an ongoing project, using a large database on German goldsmith art. The texts are encoded according to the TEI guidelines and expanded by structured descriptions of named entities and time specifications. These are building blocks for event descriptions, which are the aim of the next step. The identification of named entities allows the data to be linked with various resources within the domain of cultural heritage and beyond. For the latter case, we refer to a biological database and present a solution in a transdisciplinary perspective by means of the CIDOC Conceptual Reference Model (CRM).
Speech repairs occur often in spontaneous spoken dialogues. The ability to detect and correct those repairs is necessary for any spoken language system. We present a framework to detect and correct speech repairs in which all relevant levels of information, i.e., acoustics, lexis, syntax, and semantics, can be integrated. The basic idea is to reduce the search space for repairs as early as possible by cascading filters that involve more and more features. First, an acoustic module generates hypotheses about the existence of a repair. Second, a stochastic model suggests a correction for every hypothesis. Well-scored corrections are inserted as new paths in the word lattice. Finally, a lattice parser decides on accepting the repair.
GLP is a general linguistic processor for the analysis and generation of natural language. It will be integrated into a speech understanding system for continuously spoken German language which is currently under development at the Computer Science Department, University of Erlangen-Nuernberg (Hein (1980) and this issue).
In this paper we describe an approach to automatic evaluation of both the speech recognition and understanding capabilities of a spoken dialogue system for train timetable information. We use word accuracy to judge recognition performance and concept accuracy to judge understanding performance. Both measures are calculated by comparing the output of the respective module with a correct reference answer. We report evaluation results for a spontaneous speech corpus with about 10000 utterances. We observed a nearly linear relationship between word accuracy and concept accuracy.
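The abstract does not spell out how the two measures are defined. As a minimal sketch, assuming the commonly used definition accuracy = 1 - (S + D + I) / N (substitutions, deletions, and insertions relative to N reference units), both can be computed with the same edit-distance routine: over words for word accuracy, and over semantic units for concept accuracy. The function names and the `slot=value` encoding of semantic units are hypothetical choices for illustration, not taken from the paper.

```python
from typing import Hashable, Sequence


def edit_distance(ref: Sequence[Hashable], hyp: Sequence[Hashable]) -> int:
    """Levenshtein distance: minimal number of substitutions,
    deletions, and insertions needed to turn ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference units into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def accuracy(ref: Sequence[Hashable], hyp: Sequence[Hashable]) -> float:
    """Assumed definition: 1 - (S + D + I) / N, with N = len(ref)."""
    if not ref:
        raise ValueError("reference must not be empty")
    return 1.0 - edit_distance(ref, hyp) / len(ref)


# Word accuracy: units are words (illustrative utterance, not corpus data).
ref_words = "when does the next train to hamburg leave".split()
hyp_words = "when does the train to homburg leave".split()
word_acc = accuracy(ref_words, hyp_words)

# Concept accuracy: units are semantic units (hypothetical slot=value encoding).
ref_units = ["destination=hamburg", "time=next"]
hyp_units = ["destination=homburg", "time=next"]
concept_acc = accuracy(ref_units, hyp_units)
```

Under this definition, many insertions can push the value below zero, which is why such measures are usually reported as accuracies rather than as the fraction of correct units.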