Whispered speech can be useful for quiet and private communication, and is the primary means of unaided spoken communication for many people experiencing voice-box deficiencies. Patients who have undergone partial or full laryngectomy are typically unable to speak anything more than hoarse whispers without the aid of prostheses or specialized speaking techniques. Each of the current prostheses and rehabilitative methods for post-laryngectomized patients (primarily oesophageal speech, tracheo-esophageal puncture, and electrolarynx) has particular disadvantages, prompting new work on nonsurgical, noninvasive alternative solutions. One such solution, described in this paper, combines whisper signal analysis with direct formant insertion and speech modification located outside the vocal tract. This approach allows laryngectomy patients to regain their ability to speak with a more natural voice than alternative methods, by whispering into an external prosthesis, which then recreates and outputs natural-sounding speech. It relies on the observation that while the pitch-generation mechanism of laryngectomy patients is damaged or unusable, the remaining components of the speech production apparatus may be largely unaffected. This paper presents analysis and reconstruction methods designed for the prosthesis, and demonstrates their ability to obtain natural-sounding speech from the whisper-speech signal using an external analysis-by-synthesis processing framework.
Whispered speech is a relatively common form of communications, used primarily to selectively exclude or include potential listeners from hearing a spoken message. Despite the everyday nature of whispering, and its undoubted usefulness in vocal communications, whispers have received relatively little research effort to date, apart from some studies analysing the main whispered vowels and some quite general estimations of whispered speech characteristics. In particular, a classic vowel space determination has been lacking for whispers. For voiced speech, this type of information has played an important role in the development and testing of recognition and processing theories over the past few decades, and can be expected to be equally useful for whisper-mode communications and recognition systems. This paper aims to redress the shortfall by presenting a vowel formant space for whispered speech, and comparing the results with corresponding phonated samples. In addition, since the study was conducted using speakers from Birmingham, the analysis extends to discuss the effect of the common British West Midlands (WM) accent in comparison with Standard English (RP). Thus, the paper presents the analysis of formant data showing differences between normal and whispered speech while also considering an accentual effect on whispered speech.
Whispering is a natural, unphonated, secondary aspect of speech communications for most people. However, it is the primary mechanism of communications for some speakers who have impaired voice production mechanisms, such as partial laryngectomees, as well as for those prescribed voice rest, which often follows surgery or damage to the larynx. Unlike most people, who choose when to whisper and when not to, these speakers may have little choice but to rely on whispers for much of their daily vocal interaction. Even though most speakers will whisper at times, and some speakers can only whisper, the majority of today's computational speech technology systems assume or require phonated speech. This article considers conversion of whispers into natural-sounding phonated speech as a noninvasive prosthetic aid for people with voice impairments who can only whisper. As a by-product, the technique is also useful for unimpaired speakers who choose to whisper. Speech reconstruction systems can be classified into those requiring training and those that do not. Among the latter, a recent parametric reconstruction framework is explored and then enhanced through a refined estimation of plausible pitch from weighted formant differences. The improved reconstruction framework, with proposed formant-derived artificial pitch modulation, is validated through subjective and objective comparison tests alongside state-of-the-art alternatives.
Psychiatrists rely on language and speech behavior as one of the main clues in psychiatric diagnosis. Descriptive psychopathology and phenomenology form the basis of a common language used by psychiatrists to describe abnormal mental states. This conventional technique of clinical observation informed early studies on disturbances of thought form, speech, and language observed in psychosis and schizophrenia. These findings resulted in language models that were used as tools in psychosis research that concerned itself with the links between formal thought disorder and language disturbances observed in schizophrenia. The end result was the development of clinical rating scales measuring severity of disturbances in speech, language, and thought form. However, these linguistic measures do not fully capture the richness of human discourse and are time-consuming and subjective when measured against psychometric rating scales. These linguistic measures have also not considered the influence of culture on psychopathology. With recent advances in computational sciences, we have seen a re-emergence of novel research using computing methods to analyze free speech for improving prediction and diagnosis of psychosis. Current studies on automated speech analysis that examine semantic incoherence are based on natural language processing and acoustic analysis, which, in some studies, have been combined with machine learning approaches for classification and prediction purposes.