2019
DOI: 10.48550/arxiv.1912.02671
Preprint

Audio-Visual Target Speaker Enhancement on Multi-Talker Environment using Event-Driven Cameras

citations
Cited by 2 publications
(6 citation statements)
references
References 0 publications
“…Listening tests for speech DRT [255] 1983 Audio-only listening test using intelligibility assessment rhyming words HINT [191] 1994 Audio-only listening test using everyday sentences Matrix-like audio-visual 2019 Matrix test using audio-visual [178] test [178] stimuli [13] Estimators of speech quality PESQ [117], [119], [120], [214] 2001 Designed to assess quality across a [3], [5]- [7], [12], [17], [37], [55], [65] based on perceptual models wide range of codecs and network [66], [76], [77], [85], [99], [107], [108] conditions mostly for telephony [109], [122], [128], [136], [153], [154] [176], [178], [179], [183], [220]- [222] [239], [244], [263], [274], [279] CSIG / CBAK / COVRL [104] 2007 Composite measures which combine [108] basic objective measures HASQI [131], [133] 2010 Specifically designed for hearing- [99], [100] impaired listeners POLQA [1...…”
Section: Type (mentioning)
confidence: 99%
“…[65]
AAM of mouth region: [136]
2D-DCT of mouth region: [3]-[6]
Optical flow: [17], [65], [154], [164], [165]
Landmark-based features: [100], [154], [183], [203]
Multisensory features: [195]
Face recognition embedding: [55], [109], [169], [192], [239]
VSR embedding: [7], [10], [107]-[109], [153], [222], [273]
Facial appearance embedding: [42], [208]
Compressed mouth frames: [37]
Speaker direction: [85], [244], [279]
Acoustic Features…”
Section: Audio-visual Speech Enhancement and Separation Systems (mentioning)
confidence: 99%
“…Estimators of speech quality based on energy ratios
SNR (Signal-to-Noise Ratio): It does not provide a proper estimation of speech distortion; [12], [65], [66], [109]
SSNR (Segmental SNR) / SSNRI (SSNR Improvement): Assessment of short-time behaviour; [100], [108], [239]
SDI [31] (2006): It provides a rough distortion measure; [99], [100]
SDR [252] (2006): Specifically designed for blind audio source separation; [7], [10], [17], [42], [55], [65], [85], [107]-[109], [136], [153], [154], [169], [164], [165], [183], [192], [195], [203], [208], [220]-[222]
SIR [252] (2006): Specifically designed for blind audio source separation; [7], [65], [107], [136], [164], [165], [195]
SAR [252] (2006): Specifically designed for blind audio source separation; [65], [107], [136], [164], [165], [195]
SI-SDR [150]…”
Section: Ip Transmission (mentioning)
confidence: 99%
“…Magnitude spectrogram [3]-[7], [10], [12], [17], [37], [42], [65], [66], [76], [77], [85], [99], [100], [107], [122], [128], [136], [153], [154], [164], [165], [176], [178], [179], [183], [192], [195], [203], [208], [220]-[222], [244], [263], [274], [279] Phase a…”
Section: Audio-visual Speech Enhancement and Separation Systems (mentioning)
confidence: 99%