Automatic Speech Recognition Errors Detection and Correction: A Review

Errattahi, Rahhal; Hannani, Asmaa El; Ouahmane, Hassan

doi:10.1016/j.procs.2018.03.005

Cited by 122 publications

(80 citation statements)

References 12 publications


“…This could be addressed by telephone contact, but one unexpected outcome of using instant messaging was the active use by some patients of session transcripts as personalised psychoeducation materials. This benefit would be lost, although the latest advances in voice recognition and automatic transcription may eventually render this issue obsolete [20,25,81]. Access to transcripts may be particularly helpful for patients who do not engage with formal worksheets and could be actively encouraged by therapists in such situations.…”

Section: The Choice Of Communications Modes (mentioning)

confidence: 99%

Integrating the Digital and the Traditional to Deliver Therapy for Depression

Stawarz, Preist, Tallon, et al. 2020

Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems


Traditional approaches to psychotherapy emphasise face-to-face contact between patients and therapists. In contrast, current computerised approaches tend to minimise this contact. This can limit the range of mental health difficulties for which computerised approaches are effective. Here, we explore an alternative approach that integrates face-to-face contact, electronic contact, online collaboration, and support for between-session activities. Our discussion is grounded in the design of a platform to deliver psychotherapy for depression. We report findings of an 11-month pragmatic study in which 17 patients received treatment for depression via the platform. Results show how design decisions had a significant impact on the dynamics of therapeutic sessions and the establishment of patient-therapist relationships. For example, the use of instant messaging for synchronous, in-session contact slowed communication, but also provided a valuable space for reflection and helped to maintain session focus. We discuss the impact of flexibility and the potential of integrated approaches to both enhance and reduce patient engagement.



“…machine translation, natural language processing). [20] presents an overview of previous work on error correction for ASR. However, most of researches were limited to the detection [21,22,23] and just few researches addressed the correction process of ASR errors.…”

Section: Related Work (mentioning)

confidence: 99%

Investigation of Transformer Based Spelling Correction Model for CTC-Based End-to-End Mandarin Speech Recognition

2019


Connectionist Temporal Classification (CTC) based end-to-end speech recognition system usually need to incorporate an external language model by using WFST-based decoding in order to achieve promising results. This is more essential to Mandarin speech recognition since it owns a special phenomenon, namely homophone, which causes a lot of substitution errors. The linguistic information introduced by language model is somehow helpful to distinguish these substitution errors. In this work, we propose a transformer based spelling correction model to automatically correct errors, especially the substitution errors, made by CTC-based Mandarin speech recognition system. Specifically, we investigate to use the recognition results generated by CTC-based systems as input and the ground-truth transcriptions as output to train a transformer with encoder-decoder architecture, which is much similar to machine translation. Experimental results in a 20,000 hours Mandarin speech recognition task show that the proposed spelling correction model can achieve a CER of 3.41%, which results in 22.9% and 53.2% relative improvement compared to the baseline CTC-based systems decoded with and without language model, respectively.
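The abstract above reports results as a character error rate (CER) and as relative improvements over baselines. As a rough illustration only, the sketch below computes CER as character-level edit distance divided by reference length, and relative improvement as (baseline - improved) / baseline. These are the conventional definitions and are assumed here; the abstract does not spell them out. The function names and the example sentence pair are hypothetical and not taken from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalised by reference length."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_improvement(baseline: float, improved: float) -> float:
    """Fraction by which an error rate drops relative to the baseline."""
    return (baseline - improved) / baseline


# Hypothetical homophone substitution: one wrong character out of six.
reference = "他在公司工作"
hypothesis = "他再公司工作"
print(f"CER = {cer(reference, hypothesis):.3f}")
```

Under the assumed definition of relative improvement, a CER of 3.41% with 22.9% and 53.2% relative gains would correspond to baseline CERs of roughly 4.42% and 7.29%. These baseline values are back-calculated here and are not quoted in the abstract.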


“…The vocal cord generates sounds via disc-platelets and a series of vibrations, which give the production a speech signal, as air is breathed from the lungs. Speech processing in many specialized workstations incorporating software and telephony will make use of the growing overlap between information processing and conventional transport of information [1][2]. More recently, the interest in automatically collecting vast quantities of voice data was growing to establish not only what was being said, but also how and by whom [3].…”

Section: Introduction (mentioning)

confidence: 99%

An Appraisal on Speech and Emotion Recognition Technologies based on Machine Learning

Kumar, Jason 2020

IJRTE


In earlier days, people used speech as a means of communication or the way a listener is conveyed by voice or expression. But the idea of machine learning and various methods are necessary for the recognition of speech in the matter of interaction with machines. With a voice as a bio-metric through use and significance, speech has become an important part of speech development. In this article, we attempted to explain a variety of speech and emotion recognition techniques and comparisons between several methods based on existing algorithms and mostly speech-based methods. We have listed and distinguished speaking technologies that are focused on specifications, databases, classification, feature extraction, enhancement, segmentation and process of Speech Emotion recognition in this paper

