Enhanced PESQ algorithm for objective assessment of speech quality at a continuous varying delay

Shiran, Nitay; Shallom, Ilan D.

doi:10.1109/qomex.2009.5246960

Cited by 12 publications

(7 citation statements)

References 6 publications

“…Focusing on the frame-by-frame time alignment stage of PESQ, [15] noted that subjective scores may be poorly correlated as a result of errors in the objective quality scores caused by a few misaligned frames. Whereas, [16] discovered that PESQ time alignment failed to align continuous variable delays particularly with speech signals that have high packet loss rate and for which dynamic time processing is exhibited due to its piecewise constant delay estimation. The result of Malfait et al's work achieved a near-perfect delay profile in which for a misalignment of 10 ms, they obtained a correlation of 0.93 with the subjective score, for misalignment less than 5ms they obtained a correlation of 0.973, and have no significant improvement in the correlation coefficient for misalignment down to about 1 ms.…”

Section: Review of PESQ's Limitations (mentioning)

confidence: 99%

“…They concluded that a time alignment of ±5 ms seemed good enough for correct assessment of time-warped signals. But [16] developed a new time-alignment algorithm that identifies both fix and variable delays in speech signals by using Dynamic Time Warping (DTW) in place of the utterances correlation and splitting methods used in the original PESQ algorithm.…”

Section: Review of PESQ's Limitations (mentioning)

confidence: 99%
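The citation statement above mentions Dynamic Time Warping (DTW) as the replacement for PESQ's piecewise-constant delay estimation. As a rough illustration only, the following is a minimal textbook DTW over two 1-D feature sequences. It is not the algorithm of the cited paper: the function name `dtw_path`, the absolute-difference local cost, and the toy per-frame values are all assumptions made for this sketch.

```python
import math


def dtw_path(ref, deg):
    """Textbook DTW between two 1-D sequences.

    Returns (total_cost, path), where path is a list of (ref_index, deg_index)
    pairs describing how frames of `ref` map onto frames of `deg`.
    """
    n, m = len(ref), len(deg)
    # acc[i][j] = minimal accumulated cost of aligning ref[:i] with deg[:j]
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(ref[i - 1] - deg[j - 1])  # assumed local distance
            acc[i][j] = local + min(acc[i - 1][j - 1],  # match
                                    acc[i - 1][j],      # ref frame repeated
                                    acc[i][j - 1])      # deg frame repeated

    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(acc[i - 1][j - 1], i - 1, j - 1),
                      (acc[i - 1][j], i - 1, j),
                      (acc[i][j - 1], i, j - 1)]
        _, i, j = min(candidates, key=lambda c: c[0])
        path.append((i - 1, j - 1))
    path.reverse()
    return acc[n][m], path


# Toy example: the degraded sequence repeats its first frame (a one-frame delay).
ref = [0, 1, 2, 3, 2, 1, 0]
deg = [0, 0, 1, 2, 3, 2, 1, 0]
cost, path = dtw_path(ref, deg)
```

Because the warping path may advance one sequence while holding the other, the per-frame delay `j - i` along `path` can change from frame to frame, which is the property that makes DTW attractive for continuously varying delay.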

An Improved Logistic Function for Mapping Raw Scores of Perceptual Evaluation of Speech Quality (PESQ)

Olatubosun

Olabisi

2018

JERR

Voice service being the major offering of telecommunication networks, its level of Quality of Service (QoS) largely determines the performance of these networks. This work evaluated the state-of-the-art Perceptual Evaluation of Speech Quality (PESQ) objective model for perceptual estimation of the quality of transmitted speech signals. Perceptual estimation of the quality of speech is predominantly done by subjective techniques and the results presented as Mean Opinion Scores (MOS), on a scale from 1 for poor quality to 5 for excellent quality. Despite the constraints of the subjective approach to perceptual speech quality estimation, its scores serve as the basis for correlating quality scores from objective techniques for speech quality estimation. Original or reference speeches were recorded using professional studio equipment and software, guided by the provisions of ITU-T P.830. The speeches were transmitted over three mobile wireless networks. A speech database consisting of 64 original (32 male and 32 female) and 192 transmitted speeches was developed. Reference speeches and their corresponding transmitted (network-degraded) speeches were tested on the PESQ model to estimate their quality scores. The raw PESQ quality scores lie in the range -0.5 to 4.5. They were mapped to the MOS scale for linear comparison of the scales. Study of the PESQ model showed several shortcomings, some of which have been improved upon by previous researchers. Evaluating the PESQ mapping function (in ITU-T Rec. P.862.1) showed the need for better coverage of the MOS scale. Analysis of the solution for the logistic growth function was done and parameters were optimised, which resulted in the development of a new robust logistic mapping function. The raw PESQ quality scores were mapped using the developed mapping function as well as two known standard mapping functions, namely the ITU-T P.862.1 and the Morfitt and Cotanis mapping functions.
The mapped scores known as PESQ MOS-listening quality objective (PESQ MOS-LQO) obtained with the three functions were tested using ANOVA at a significant figure of . The developed logistic mapping function offered a quality score coverage of 98.6% of the MOS scale. This was evaluated against the two known standard mapping functions and the developed function offered improvement of 11.8 and 4.9% over and above their 86.8 and 93.7% coverage of the MOS scale respectively. At the significance level of , an F-value of 60.6042, a critical-F of 3.04, and a p-value of 4.61721E-21 were obtained. With p < 0.05, the Null Hypothesis was rejected, and the critical-F value being less than the F-statistic value confirmed the rejection. Therefore, the data distribution of at least one of the functions has a different mean and belongs to a separate population of performance.
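For orientation, the standard logistic mapping that the abstract compares against can be sketched as follows. The coefficients are the ones commonly quoted for ITU-T Rec. P.862.1 and should be checked against the Recommendation before use. The function name `pesq_raw_to_mos_lqo` is an assumption of this sketch, and the new mapping function developed in the paper is not reproduced here because its parameters do not appear in this text.

```python
import math


def pesq_raw_to_mos_lqo(raw_score):
    """Map a raw PESQ score (about -0.5 to 4.5) to MOS-LQO.

    Logistic mapping with the coefficients commonly quoted for
    ITU-T Rec. P.862.1 (assumed here; verify against the standard).
    """
    return 0.999 + 4.0 / (1.0 + math.exp(-1.4945 * raw_score + 4.6607))


# Evaluate the mapping at the two ends of the raw PESQ range.
lowest = pesq_raw_to_mos_lqo(-0.5)
highest = pesq_raw_to_mos_lqo(4.5)
```

Under these coefficients the mapped values stay strictly inside the 1 to 5 MOS scale at both ends of the raw range, which is consistent with the abstract's point that a standard mapping does not cover the whole scale.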

“…It needs the reference signal and the degraded signal and its major steps are: level and time alignment, equalization, auditory transform, disturbance processing, cognitive modelling, and MOS (Mean Opinion Score) prediction. Other variations have also been defined as F-PESQ (Framed PESQ) [3] and E-PESQ (Enhanced PESQ) [14].…”

Section: Speech Quality Evaluation (mentioning)

confidence: 99%

Reference-Free Speech Quality Assessment for Mobile Phones Based on Audio Perception

Mello

Albuquerque

2015

2015 IEEE International Conference on Systems, Man, and Cybernetics

Speech quality evaluation is a very complex task with important applications. With the deployment of 3G and 4G networks, end users are demanding higher quality of service. Currently, the assessment of speech quality over mobile phones is done in a controlled environment with a clean reference signal and its distorted version. However, because it requires a reference signal, this approach cannot evaluate the quality of speech in a real-time situation. This paper introduces a new method to classify a speech signal on a mobile device as distorted or not. The degraded signal is evaluated and its quality is scored in a manner similar to PESQ (Perceptual Evaluation of Speech Quality). The experiments showed the method to be quite efficient, with an average difference of 12% compared to PESQ scores, which are based on a clean reference signal.

“…The first is a time alignment stage that aligns the separated signal and reference signal. In the next stage a psychoacoustics model is used to calculate an auditory representation of the signals, followed by a cognitive model that calculates final score based on the differences between signals [7]. Formula (4) represents segmental version of SNR (SNRS), what is time domain measure.…”

Section: Measures For Intelligibility Assessment (mentioning)

confidence: 99%

Intelligibility Assessment of Ideal Binary-Masked Noisy Speech with Acceptance of Room Acoustic

Sedlák

Durackova

Roman

et al. 2015

Journal of Electrical Engineering

In this paper the intelligibility of ideal binary-masked noisy signals is evaluated for different signal-to-noise ratios (SNR), mask errors, masker types, distances between source and receiver, reverberation times and local criteria for forming the binary mask. The ideal binary mask is computed from time-frequency decompositions of the target and masker signals by thresholding the local SNR within time-frequency units. The intelligibility of the separated signal is measured using different objective measures computed in the frequency and perceptual domains. The present study replicates and extends findings that were already presented, but mainly shows the impact of room acoustics on the intelligibility performance of the IBM technique.
