ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9054204

Wawenets: A No-Reference Convolutional Waveform-Based Approach to Estimating Narrowband and Wideband Speech Quality

Cited by 31 publications (44 citation statements)
References 16 publications
“…Table 4 presents the validation and test set results of the overall MOS prediction compared to the single-ended models P.563 [5], ANIQUE+ [36], WAWEnets [12], and the double-ended models POLQA [4], DIAL [37], and VISQOL (v3.1.0) [38]. NISQA outperforms the other single-ended speech quality models on most of the datasets, except for the ITU-T Suppl.…”
Section: Results (mentioning)
confidence: 99%
“…Recently, deep learning methods have been applied to build single-ended speech quality models [9][10][11][12][13][14][15][16][17] and showed to outperform traditional approaches without the need for a clean reference. In [18], we presented the deep learning model NISQA that predicts speech quality of super-wideband (SWB) speech samples.…”
Section: Introduction (mentioning)
confidence: 99%
“…It has to be noted that WEnets PESQ operates without reference signal, so the task for this measure is significantly more difficult than for all other measures. The original work reports Pearson correlation of 0.97 with PESQ [63], where training and testing signals were speech items processed by different speech codecs followed by noise suppression. This type of material fits PESQ original domain, but not many of our experiments, where, e.g.…”
Section: Results For the Artifacts-only Scores For The Source Separat... (mentioning)
confidence: 99%
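The Pearson correlation quoted in the statement above measures how closely a no-reference estimate tracks its target score. A minimal sketch of that computation follows, assuming two plain lists of per-file scores. The function name `pearson_correlation` and the score values are hypothetical, made up for illustration, and are not taken from the cited papers.

```python
import math


def pearson_correlation(predicted, target):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(predicted) != len(target) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_t = sum(target) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, target))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_t)


# Hypothetical scores for five speech files (illustrative values only).
target_scores = [1.2, 2.5, 3.1, 3.8, 4.3]      # e.g. full-reference scores
estimated_scores = [1.4, 2.3, 3.0, 3.9, 4.1]   # e.g. no-reference estimates
r = pearson_correlation(estimated_scores, target_scores)
print(f"Pearson r = {r:.3f}")
```

A value near 1 indicates that the estimates rise and fall with the targets. It says nothing about absolute error, so a model can correlate well while still being offset or scaled relative to the target metric.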
“…with the aim of making them completely or partially non-intrusive [61]-[64]. Of the referenced works, only [63] provides the trained DNNs [65], referred to as Waveform Evaluation Networks (WEnets). These are four DNNs, trained to predict PESQ, POLQA, PEMO-Q, or Short-Time Objective Intelligibility (STOI), without reference signals.…”
Section: O Methods Based On Deep Learning (mentioning)
confidence: 99%
“…Another possibility to relax supervision is through the prediction or generation of pseudo ground truth. Although it is tempting to calculate the loss through a no-reference speech quality prediction network [5], experiments have shown that DNNs might over-optimize one perceptual metric without necessarily improving others [6,7], let alone a prediction of them. Wang et al. used a pair of generative adversarial networks to map speech signals from noisy to clean [8].…”
Section: Introduction (mentioning)
confidence: 99%