1995
DOI: 10.1073/pnas.92.22.10040
Toward the ultimate synthesis/recognition system.

Abstract: This paper predicts speech synthesis, speech recognition, and speaker recognition technology for the year 2001, and it describes the most important research problems to be solved in order to arrive at these ultimate synthesis and recognition systems. The problems for speech synthesis include natural and intelligible voice production, prosody control based on meaning, capability of controlling synthesized voice quality and choosing individual speaking style, multilingual and multidialectal synthesis, choice of …

Cited by 5 publications (2 citation statements)
References 18 publications
“…As computerized technology becomes an ever greater fixture at home and at work, our future interactions with it will need to become even more sophisticated (Wendemuth and Biundo, 2011; Honold et al., 2014). Some time ago, it was recommended that artificial speech synthesis technology should not only have the ability to control prosody based on meaning, but also the capability to control individual speaking style (another form of prosody), to choose application-oriented speaking styles, and to add emotion (Furui, 1995). Yet, as we have seen, there remains much work to be done (Burkhardt and Stegmann, 2009).…”

Section: What About the Future? (confidence: 99%)
“…Additional work on the social skills and responsivity with which HCI-AI are programmed will likely further increase the empathy and acceptance level of interactions (Leite et al., 2013). From the human-interface point of view, it has long been recognized that HCI-AI should be able to automatically acquire new knowledge about the thinking process of individual users, automatically correct user errors, and understand user intentions by accepting rough instructions and inferring details (Furui, 1995). Ultimately, the hope for the future is that HCI-AI could extract the prosodic cues from a user’s speech, capitalize on that information to inform predictive models of likely emotions (Litman and Forbes-Riley, 2006), and amend their own displays and actions accordingly.…”

Section: What About the Future? (confidence: 99%)