The loss in performance caused by mismatch between train and test material suggests a need for task-specific acoustic models, especially for highly demanding tasks. However, since the training of these models is extremely expensive, general purpose models are more attractive. In this paper we address the impact of mismatch in speaking style and task. We trained three sets of acoustic models on data from different tasks, involving both read and extemporaneous speech. The average utterance length in the training corpora varied between 10.5 and 1.2 words. The models were tested on matched as well as on very different tasks. The results suggest that general purpose models trained from short utterances are to be preferred in most spoken dialog systems. However, these models might not perform adequately in dictation tasks.
The following full text is an author's version, which may differ from the publisher's version. For additional information about this publication, see http://hdl.handle.net/2066/75039. Please be advised that this information was generated on 2022-08-08 and may be subject to change.
We present the evaluation of the most recent version of the Dutch ARISE train timetable information system. The original version of this spoken dialogue system has been adjusted according to the findings of two user tests [1,2]. The new version applies a mixture of implicit and explicit confirmation of information items, based on confidence measures. In addition, the negotiation part of the dialogue tells the user explicitly what they can ask. Furthermore, the exception handling was made very explicit. The new dialogue has been evaluated by 25 experts and through 200 anonymous calls to the system. To be able to compare the two versions of the system, the same scenarios as in [1] were used. It was shown that the mixture of implicit and explicit confirmation results in shorter dialogues and in slightly higher dialogue success rates. Also, we observed a better performance in the negotiation part of the dialogue. However, the shortcomings of working with scenarios are once again made clear.
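The confidence-based choice between implicit and explicit confirmation described above can be sketched as follows. This is a minimal illustration, assuming a single confidence score per recognized item and a hand-picked threshold; the actual ARISE strategy, thresholds, and prompt wording are not specified in the abstract.

```python
def choose_confirmation(value, confidence, threshold=0.7):
    """Pick a confirmation prompt for a recognized information item.

    The threshold and prompt wording are illustrative assumptions,
    not taken from the ARISE system itself.
    """
    if confidence >= threshold:
        # Implicit confirmation: embed the value in the next question,
        # letting the user correct it only if it is wrong.
        return f"At what time do you want to travel to {value}?"
    # Explicit confirmation: verify the value before moving on.
    return f"Did you say you want to travel to {value}?"
```

Implicit confirmation saves a turn when recognition is likely correct, which is consistent with the shorter dialogues reported above; explicit confirmation is reserved for low-confidence items, where an unnoticed error would be costlier than the extra turn.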
There are very few, if any, published accounts of practical experience with Speaker Verification as a means to provide secure access to telematics services. Yet, there is no reason to expect that Speaker Verification is very different from speech recognition, for which many deployed services have shown the need for close and intensive on-line monitoring during the time when the service becomes operational. In this paper we present our experience with a monitoring scheme for Speaker Verification during the field test of a financial investment game. Many of the issues that were monitored were suggested by our experience with a semi-operational service, viz. free access to Directory Assistance for the visually impaired. A newly developed enrolment procedure that can flag potentially weak speaker models is an essential part of the monitoring procedure.
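Flagging potentially weak speaker models at enrolment time, as mentioned above, might look like the following sketch. The scoring scale, thresholds, and criterion are hypothetical assumptions for illustration; the abstract does not specify how the actual procedure decides.

```python
def flag_weak_model(enrolment_scores, min_mean=0.5, min_utterances=3):
    """Return True if a freshly enrolled speaker model looks weak.

    `enrolment_scores` are assumed to be per-utterance verification
    scores of the enrolment speech against the new model, on a 0-1
    scale. Both thresholds are illustrative, not from the paper.
    """
    if len(enrolment_scores) < min_utterances:
        # Too little enrolment data to trust the model at all.
        return True
    mean_score = sum(enrolment_scores) / len(enrolment_scores)
    # A model that scores poorly even on its own enrolment data is
    # likely to reject its owner in operation.
    return mean_score < min_mean
```

Flagging at enrolment, rather than waiting for failed verifications in the field, lets an operator re-enrol a user before the weak model causes false rejections in a live service.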