Non-Contact Speech Recovery Technology Using a 24 GHz Portable Auditory Radar and Webcam

Ma, Yue; Hong, Hong; Li, Hui; Zhao, Heng; Li, Yusheng; Sun, Li; Gu, Chunhua; Zhu, Xiaohua

doi:10.3390/rs12040653

Cited by 6 publications

(2 citation statements)

References 30 publications

“…At present, the present technologies for obtaining speech signals can be divided into air conduction and non-air conduction detection. However, the shortcomings limit the development of these techniques for speech detection [1][2][3][4][5]. In recent years, bio-radar technology has been developed for using in a variety of remote sensing applications [6,7].…”

Section: Introduction

mentioning

confidence: 99%

94 GHz Asymmetric Antenna Radar for Speech Signal Detection and Enhancement via Variational Mode Decomposition and Improved Threshold Strategy

Chen

Wang

2022

IEEE Access

To further improve the detection distance and sensitivity of bio-radar, a 94 GHz asymmetric antenna radar sensor is employed to detect speech signal. However, the radar speech is often mixed with various noise, which will seriously affect the quality and intelligibility of the speech signal. Therefore, a novel method based on variational mode decomposition (VMD) and improved threshold strategy (ITS) is proposed in this paper for improving the quality and intelligibility of the radar speech. VMD is a novel adaptive decomposition method, which overcomes the problem of mode aliasing and end effect in empirical mode decomposition (EMD). ITS can overcome the limitation of traditional wavelet threshold and achieve the best compromise between speech intelligibility and noise reduction. Firstly, EMD is applied to determine the number of decomposition level, and then radar speech is decomposed into several limited bandwidth intrinsic mode functions by VMD. Secondly, ITS is employed to remove noise from useful modes which are determined by Pearson correlation coefficient (PCC). The performance of the proposed method is evaluated by perceptual evaluation of speech quality (PESQ), short-time objective intelligibility (STOI) and composite measures (CMs). The experimental results show that the radar sensor can detect long distance speech signal and the proposed method can effectively improve the quality and intelligibility of the radar speech signal. Due to the good performance, the proposed method will provide a promising alternative for various applications related to radar speech and traditional microphone speech signal enhancement.

Index Terms: 94 GHz asymmetric antenna; radar speech; speech enhancement; variational mode decomposition; improved threshold strategy; composite measure
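The mode-selection and denoising stage described in this abstract can be illustrated with a minimal sketch. It assumes the VMD modes have already been computed elsewhere and are passed in as plain lists. The abstract does not specify the improved threshold strategy, so ordinary soft thresholding stands in for it here. The function names, the `pcc_min` cutoff, and the `threshold` value are all hypothetical choices made for illustration, not the authors' settings.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sequence has no defined correlation; treat as 0
    return cov / math.sqrt(var_x * var_y)


def soft_threshold(x, t):
    """Standard soft thresholding: shrink each sample toward zero by t.

    Stand-in only: the paper's improved threshold strategy (ITS) is not
    described in the abstract and is NOT reproduced here.
    """
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in x]


def select_and_denoise(signal, modes, pcc_min=0.3, threshold=0.1):
    """Keep modes correlated with the signal, threshold them, and sum them.

    pcc_min and threshold are hypothetical illustration values.
    """
    kept = [
        soft_threshold(mode, threshold)
        for mode in modes
        if abs(pearson(signal, mode)) >= pcc_min
    ]
    if not kept:
        return [0.0] * len(signal)
    return [sum(samples) for samples in zip(*kept)]
```

In this sketch a mode that correlates weakly with the recorded signal is treated as noise and dropped entirely, while the retained modes are shrunk sample by sample before being summed into the enhanced signal.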

“…Based on recent advances in machine learning-based technologies, the conversion of biosignals to speech signals has been reported in several studies [3], [4], [5]. Various signals have been considered for speech generation and enhancement, including surface electromyography (sEMG) [3], [6], electromagnetic articulography (EMA) [4], [7], permanent magnetic articulography (PMA) [5], [8], ultrasound tongue imaging [9], [10], Doppler signals [11], [12], visual cues [13], [14], and bone-conducted microphone signals [15]. Further, multimodal learning has been leveraged to integrate information from complementary data, such as text [16], videos [13], bone-conducted microphone signals [15], and articulatory movements [4]. However, the transformation of articulatory movements to facilitate communication has not yet been adequately researched.…”

mentioning

confidence: 99%

EPG2S: Speech Generation and Speech Enhancement Based on Electropalatography and Audio Signals Using Multimodal Learning

Chen

Tsai

et al. 2022

IEEE Signal Process. Lett.

Speech generation and enhancement based on articulatory movements facilitate communication when the scope of verbal communication is absent, e.g., in patients who have lost the ability to speak. Although various techniques have been proposed to this end, electropalatography (EPG), which is a monitoring technique that records contact between the tongue and hard palate during speech, has not been adequately explored. Herein, we propose a novel multimodal EPG-to-speech (EPG2S) system that utilizes EPG and speech signals for speech generation and enhancement. Different fusion strategies based on multiple combinations of EPG and noisy speech signals are examined, and the viability of the proposed method is investigated. Experimental results indicate that EPG2S achieves desirable speech generation outcomes based solely on EPG signals. Further, the addition of noisy speech signals is observed to improve quality and intelligibility. Additionally, EPG2S is observed to achieve high-quality speech enhancement based solely on audio signals, with the addition of EPG signals further improving the performance. The late fusion strategy is deemed to be the most effective approach for simultaneous speech generation and enhancement.

Non-contact far-field speech detection

Farhat

Djerafi

2024

2024 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)
