This paper describes a project to detect dependencies between Japanese phrasal units called bunsetsus and to detect sentence boundaries in a spontaneous speech corpus. In monologues, the biggest problem with dependency structure analysis is that sentence boundaries are ambiguous. In this paper, we propose two methods for improving the accuracy of sentence boundary detection in spontaneous Japanese speech: one is based on statistical machine translation using dependency information, and the other is based on text chunking using SVMs. The proposed methods achieved an F-measure of 84.9 for sentence boundary detection. The accuracy of dependency structure analysis was also improved from 75.2% to 77.2% by using automatically detected sentence boundaries. The accuracy of dependency structure analysis and that of sentence boundary detection were further improved by interactively using both automatically detected dependency structures and sentence boundaries.
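The chunking formulation above can be illustrated with a toy sketch: each token is classified as boundary-closing or not, from features of its local word context. A simple linear perceptron stands in here for the SVM used in the paper, and the romanized mini-corpus and feature template (current, previous, and next word) are invented for illustration, not the paper's actual setup.

```python
# Toy sketch: sentence boundary detection in transcribed speech framed as
# token-level chunking. A linear perceptron stands in for the paper's SVM;
# the feature template and training data are illustrative assumptions.

def features(tokens, i):
    """Features for deciding whether a sentence boundary follows tokens[i]."""
    return {
        f"cur={tokens[i]}",
        f"next={tokens[i + 1] if i + 1 < len(tokens) else '<EOS>'}",
        f"prev={tokens[i - 1] if i > 0 else '<BOS>'}",
    }

def train(examples, epochs=10):
    """examples: list of (tokens, labels); labels[i] is 1 if a sentence
    boundary follows tokens[i], else 0."""
    w = {}
    for _ in range(epochs):
        for tokens, labels in examples:
            for i, y in enumerate(labels):
                score = sum(w.get(f, 0.0) for f in features(tokens, i))
                pred = 1 if score > 0 else 0
                if pred != y:  # perceptron update on mistakes only
                    for f in features(tokens, i):
                        w[f] = w.get(f, 0.0) + (y - pred)
    return w

def detect_boundaries(w, tokens):
    """Return indices of tokens that close a sentence."""
    return [i for i in range(len(tokens))
            if sum(w.get(f, 0.0) for f in features(tokens, i)) > 0]

# Invented mini-corpus: polite endings like 'desu'/'-masu' often close a sentence.
train_data = [
    (["kyou", "wa", "ii", "tenki", "desu", "soshite", "dekakemasu"],
     [0, 0, 0, 0, 1, 0, 1]),
    (["sore", "wa", "hon", "desu", "watashi", "no", "hon", "desu"],
     [0, 0, 0, 1, 0, 0, 0, 1]),
]
w = train(train_data)
print(detect_boundaries(w, ["kore", "wa", "pen", "desu", "soshite", "kakimasu"]))
# → [3, 5]
```

A real system would use richer features (morphemes, pauses, fillers) and a margin-based learner, but the chunking framing is the same.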
A new method for automatic detection of section boundaries and extraction of key sentences from lecture audio archives is proposed. The method makes use of 'discourse markers' (DMs), which are characteristic expressions used in the initial utterances of sections, together with pause and language model information. The DMs are derived in a totally unsupervised manner based on word statistics. An experimental evaluation using the Corpus of Spontaneous Japanese (CSJ) demonstrates that the proposed method provides better indexing of section boundaries than a simple baseline method using pause information only, and that it is robust against speech recognition errors. The method is also applied to the extraction of key sentences that can index the section topics. The statistics of the presumed DMs are used to define the importance of sentences, which favors potentially section-initial ones. This measure is also combined with the conventional tf-idf measure based on content words. Experimental results confirm the effectiveness of using the DMs in combination with the keyword-based method. The paper also describes a statistical framework for transforming raw speech transcriptions into document style, both to define appropriate sentence units and to improve readability.
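The combined importance measure can be sketched as an interpolation of a DM term, which rewards sentences opening with a presumed section-initial marker, and a standard tf-idf term over content words. The DM weights, toy corpus, and interpolation weight `lam` below are invented for illustration; the paper derives the DM statistics from the corpus itself.

```python
# Toy sketch of key-sentence scoring: interpolate a discourse-marker (DM)
# score, favoring likely section-initial sentences, with tf-idf over
# content words. All weights and data here are illustrative assumptions.
import math

def tfidf_score(sentence, docs):
    """Sum of tf-idf weights of the sentence's words over a toy corpus."""
    n = len(docs)
    score = 0.0
    for word in sentence:
        tf = sentence.count(word)
        df = sum(1 for d in docs if word in d)
        if df:  # ignore words unseen in the corpus
            score += tf * math.log(n / df)
    return score

def dm_score(sentence, dm_weights):
    """Reward sentences that begin with a presumed discourse marker."""
    return dm_weights.get(sentence[0], 0.0) if sentence else 0.0

def importance(sentence, docs, dm_weights, lam=0.5):
    """Interpolated sentence importance; lam balances the two terms."""
    return (lam * dm_score(sentence, dm_weights)
            + (1 - lam) * tfidf_score(sentence, docs))

# Hypothetical data: 'tsugini' ("next") acts as a section-initial DM.
docs = [["onsei", "ninshiki"], ["gengo", "model"], ["onsei", "gousei"]]
dm_weights = {"tsugini": 1.5, "dewa": 2.0}
print(importance(["tsugini", "gengo", "model"], docs, dm_weights))
```

With two sentences of equal keyword content, the DM term breaks the tie in favor of the one opening a section, which is the intended behavior.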
We have developed an active listening system for a conversation robot, specifically for reminiscing. The aim of the system is to contribute to the prevention of dementia in elderly persons and to reduce loneliness in seniors living alone. Based on the speech recognition results for a user's utterance, the proposed system produces backchannel feedback, repeats the user's utterance, and asks for information about predicates that were not included in the original utterance. Moreover, the system produces an appropriate empathic response by estimating the user's emotion from their utterances. One of the features of our system is that it can determine an appropriate response even if the speech recognition results contain some errors. Our results show that 45.5% of the subjects (n = 110) continued conversing with the robot for more than two minutes on the topic "memorable trip". The system response was deemed correct for about 77% of user utterances. Based on the results of a questionnaire, the elderly subjects gave the system positive evaluations.
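The response-selection behavior described above can be sketched as a back-off policy: when the ASR hypothesis is unreliable, fall back to a generic backchannel; when a topic word is recognized without a predicate, ask about the missing predicate; otherwise repeat and confirm. The English lexicons, templates, and confidence threshold below are invented stand-ins; the actual system works in Japanese and also estimates user emotion.

```python
# Minimal sketch of active-listening response selection with back-off.
# Lexicons, templates, and the 0.4 threshold are illustrative assumptions.

PREDICATES = {"went", "ate", "saw", "bought"}   # toy predicate lexicon
TOPICS = {"kyoto", "beach", "temple", "ramen"}  # toy content-word lexicon

def respond(asr_tokens, confidence):
    """Choose a response given an ASR hypothesis and its confidence."""
    if confidence < 0.4:
        return "Uh-huh."  # backchannel: safe even when ASR is wrong
    topics = [w for w in asr_tokens if w in TOPICS]
    preds = [w for w in asr_tokens if w in PREDICATES]
    if topics and not preds:
        # ask about the predicate missing from the utterance
        return f"What did you do in {topics[0]}?"
    if topics:
        # repeat/confirm what was understood
        return f"{topics[0].title()}, and you {preds[0]}. How nice."
    return "Tell me more."

print(respond(["we", "went", "to", "kyoto"], 0.9))
```

The back-off structure is what lets the system respond acceptably even with recognition errors: low-confidence or off-lexicon input degrades to generic but safe responses rather than wrong content.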