Peng Hou scite author profile

Peng Hou

5 Publications

70 Citation Statements Received

60 Citation Statements Given

Affiliations

Dalian University of Technology, Xidian University, Northwestern Polytechnical University

Publications

Hearing Lips: Improving Lip Reading by Distilling Speech Recognizers

Zhao, Wang, et al. 2020

AAAI

Lip reading has witnessed unparalleled development in recent years thanks to deep learning and the availability of large-scale datasets. Despite the encouraging results achieved, the performance of lip reading, unfortunately, remains inferior to the one of its counterpart speech recognition, due to the ambiguous nature of its actuations that makes it challenging to extract discriminant features from the lip movement videos. In this paper, we propose a new method, termed as Lip by Speech (LIBS), of which the goal is to strengthen lip reading by learning from speech recognizers. The rationale behind our approach is that the features extracted from speech recognizers may provide complementary and discriminant clues, which are formidable to be obtained from the subtle movements of the lips, and consequently facilitate the training of lip readers. This is achieved, specifically, by distilling multi-granularity knowledge from speech recognizers to lip readers. To conduct this cross-modal knowledge distillation, we utilize an efficacious alignment scheme to handle the inconsistent lengths of the audios and videos, as well as an innovative filtering strategy to refine the speech recognizer's prediction. The proposed method achieves the new state-of-the-art performance on the CMLR and LRS2 datasets, outperforming the baseline by a margin of 7.66% and 2.75% in character error rate, respectively.

Robots using environment objects as tools the ‘MacGyver’ paradigm for mobile manipulation

Stilman, Zafar, Erdoğan, et al. 2014

Multi-target tracking algorithm based on PHD filter against multi-range-false-target jamming

Chen, Pei, Hou, et al. 2020

J. of Syst. Eng. Electron.

Multi-range-false-target (MRFT) jamming is particularly challenging for tracking radar due to the dense clutter and the repeated multiple false targets. The conventional association-based multi-target tracking (MTT) methods suffer from high computational complexity and limited usage in the presence of MRFT jamming. In order to solve the above problems, an efficient and adaptable probability hypothesis density (PHD) filter is proposed. Based on the gating strategy, the obtained measurements are firstly classified into the generalized newborn target and the existing target measurements. The two categories of measurements are independently used in the decomposed form of the PHD filter. Meanwhile, an amplitude feature is used to suppress the dense clutter. In addition, an MRFT jamming suppression algorithm is introduced to the filter. Target amplitude information and phase quantization information are jointly used to deal with MRFT jamming and the clutter by modifying the particle weights of the generalized newborn targets. Simulations demonstrate the proposed algorithm can obtain superior correct discrimination rate of MRFT, and high-accuracy tracking performance with high computational efficiency in the presence of MRFT jamming in the dense clutter.
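
The gating step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a plain Euclidean gate around each predicted target position; the function name `classify_by_gate`, the 2-D position tuples, and the gate radius are hypothetical choices for illustration and are not taken from the paper, which may use a different (for example, statistical) gate.

```python
import math

def classify_by_gate(measurements, predicted_targets, gate_radius):
    """Split measurements into 'existing' and 'newborn' sets.

    A measurement is assigned to the existing-target set when it lies
    within gate_radius of at least one predicted target position;
    otherwise it is treated as a candidate newborn-target measurement.
    This Euclidean gate is an illustrative assumption, not the paper's.
    """
    existing, newborn = [], []
    for z in measurements:
        in_gate = any(math.dist(z, p) <= gate_radius for p in predicted_targets)
        (existing if in_gate else newborn).append(z)
    return existing, newborn

# Hypothetical predicted positions and radar measurements (x, y).
predicted = [(0.0, 0.0), (10.0, 10.0)]
measured = [(0.5, -0.2), (9.0, 10.5), (50.0, 50.0)]
existing, newborn = classify_by_gate(measured, predicted, gate_radius=2.0)
```

In this toy setup the first two measurements fall inside a gate and the third does not, so each set could then be passed to its own branch of a filter, in the spirit of the decomposed form the abstract describes.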

Hearing Lips: Improving Lip Reading by Distilling Speech Recognizers

Zhao, Wang, et al. 2019

Preprint

Lip reading has witnessed unparalleled development in recent years thanks to deep learning and the availability of large-scale datasets. Despite the encouraging results achieved, the performance of lip reading, unfortunately, remains inferior to the one of its counterpart speech recognition, due to the ambiguous nature of its actuations that makes it challenging to extract discriminant features from the lip movement videos. In this paper, we propose a new method, termed as Lip by Speech (LIBS), of which the goal is to strengthen lip reading by learning from speech recognizers. The rationale behind our approach is that the features extracted from speech recognizers may provide complementary and discriminant clues, which are formidable to be obtained from the subtle movements of the lips, and consequently facilitate the training of lip readers. This is achieved, specifically, by distilling multi-granularity knowledge from speech recognizers to lip readers. To conduct this cross-modal knowledge distillation, we utilize an efficacious alignment scheme to handle the inconsistent lengths of the audios and videos, as well as an innovative filtering strategy to refine the speech recognizer's prediction. The proposed method achieves the new state-of-the-art performance on the CMLR and LRS2 datasets, outperforming the baseline by a margin of 7.66% and 2.75% in character error rate, respectively.

Long-term morphodynamic evolution in the Modaomen Estuary of the Pearl River Delta, South China

Liu, Duan, et al. 2022

Geomorphology
