ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023
DOI: 10.1109/icassp49357.2023.10094784
On Word Error Rate Definitions and Their Efficient Computation for Multi-Speaker Speech Recognition Systems

Abstract: MeetEval is an open-source toolkit to evaluate all kinds of meeting transcription systems. It provides a unified interface for the computation of commonly used Word Error Rates (WERs), specifically cpWER, ORC WER and MIMO WER, along with other WER definitions. We extend the cpWER computation by a temporal constraint to ensure that words are identified as correct only when the temporal alignment is plausible. This leads to a better quality of the matching of the hypothesis string to the reference string that more clo…

Cited by 11 publications
(1 citation statement)
References 27 publications
“…where S, D, I, and C are the number of substitutions, deletions, insertions, and correct words in the estimated sequence, respectively, and N is the number of words in the true sequence (N = S + D + C). The WER is based on the Levenshtein distance and a smaller value signifies a closer approximation between the estimated word sequence and the ground truth transcription [34]. It is important to notice that, while the WER is usually presented as a value typically between 0 and 100, it does not represent a true percentage.…”
Section: Evaluation Metric (citation type: mentioning)
confidence: 99%
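The citation statement above describes the standard single-speaker WER, i.e. WER = (S + D + I) / N with N = S + D + C. As a minimal sketch of that definition, the Python below computes the counts with a word-level Levenshtein alignment. The function names `wer_counts` and `wer` are illustrative assumptions, not part of MeetEval or of the citing paper, and the sketch does not cover the multi-speaker variants (cpWER, ORC WER, MIMO WER) named in the abstract.

```python
def wer_counts(reference, hypothesis):
    """Return (S, D, I, C) for one minimum-cost word alignment.

    Hypothetical helper for illustration; words are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] holds (cost, S, D, I, C) for aligning ref[:i] with hyp[:j].
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0, 0)  # only deletions
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j, 0)  # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost, s, d, ins, c = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (cost, s, d, ins, c + 1)      # correct word
            else:
                diag = (cost + 1, s + 1, d, ins, c)  # substitution
            cost, s, d, ins, c = dp[i - 1][j]
            deletion = (cost + 1, s, d + 1, ins, c)
            cost, s, d, ins, c = dp[i][j - 1]
            insertion = (cost + 1, s, d, ins + 1, c)
            # Ties are broken in favour of the diagonal step.
            dp[i][j] = min(diag, deletion, insertion, key=lambda t: t[0])
    return dp[-1][-1][1:]


def wer(reference, hypothesis):
    """WER = (S + D + I) / N, with N = S + D + C (reference length)."""
    s, d, i, c = wer_counts(reference, hypothesis)
    n = s + d + c
    if n == 0:
        raise ValueError("WER is undefined for an empty reference")
    return (s + d + i) / n


print(wer("the cat sat on the mat", "the cat sat mat"))  # two deletions, N = 6
print(wer("hello", "a b c"))  # one substitution, two insertions, N = 1
```

The second call returns 3.0: because insertions are counted in the numerator but not in N, the WER can exceed 1 (or 100 when scaled), which is one reason it is not a true percentage. The split of errors into S, D and I can depend on how ties are broken, while the total S + D + I is the Levenshtein distance and does not.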