Laryngeal videoendoscopy is one of the main tools in clinical examinations for voice disorders and in voice research. High-speed videoendoscopy makes it possible to fully capture the vocal fold oscillations; however, processing the recordings typically involves a time-consuming segmentation of the glottal area by trained experts. Even though automatic methods have been proposed and the task is particularly suited for deep learning methods, there are no public datasets or benchmarks available to compare methods and to allow training of deep learning models that generalize. In an international collaboration of researchers from seven institutions from the EU and USA, we have created BAGLS, a large, multihospital dataset of 59,250 high-speed videoendoscopy frames with individually annotated segmentation masks. The frames are based on 640 recordings of healthy and disordered subjects that were recorded with varying technical equipment by numerous clinicians. The BAGLS dataset will allow an objective comparison of glottis segmentation methods and will enable interested researchers to train their own models and compare their methods.
Objectives/Hypothesis: High-speed videoendoscopy (HSV) has the potential to objectively quantify vibratory vocal fold characteristics during phonation. Glottal Analysis Tools (GAT) version 2018, developed in Erlangen, Germany, is software for determining various glottal area waveform (GAW) quantities. Before GAT can analyze HSV videos, segmenters have to define the glottis manually across videos in a semiautomatic segmentation protocol. Such interventions are hypothesized to induce variability in subsequent GAW measure computation across segmenters and may attenuate the reliability of GAT measures to a certain point. This study explored intersegmenter variability in GAT's GAW measures based on semiautomatic image processing. Study Design: Cohort study of rater reliability. Methods: In total, 20 HSV videos from normophonic and dysphonic subjects with various laryngeal disorders were selected for this study and segmented by three trained segmenters. They separately segmented the glottal areas in the same frame sets of the videos. Upon analysis of GAW, GAT offers 46 measures related to topologic GAW dynamic characteristics, GAW periodicity and perturbation characteristics, and GAW harmonic components. To address GAT's reliability, intersegmenter-based variability in these measures was examined with the intraclass correlation coefficient (ICC). Results: In general, ICC behavior of the 46 GAW measures across three raters was highly acceptable. ICC was moderate for one parameter (0.5 < ICC < 0.75), good for seven parameters (0.75 < ICC < 0.9), and excellent for 38 parameters (0.9 < ICC). Conclusions: Overall, high ICC values confirm clinical applicability of GAT for objective and quantitative assessment of HSV. Small intersegmenter differences with actual small parameter differences suggest that manual or semiautomatic segmentation in GAT does not noticeably influence clinical assessment outcome. To guarantee the software's performance, we suggest segmentation training before clinical application.
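The reliability analysis described in this abstract can be illustrated with a minimal sketch. The abstract does not state which ICC form was used, so the code below assumes the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1). The function names `icc_2_1` and `icc_category` are hypothetical, and both the "poor" label for values below 0.5 and the handling of values exactly on a threshold are assumptions; only the moderate, good, and excellent bands come from the text.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1) for an (n_subjects, n_raters) array.

    Assumed form: two-way random effects, absolute agreement, single rater.
    The abstract does not specify which ICC variant the study used.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)  # one mean per subject (here: per video)
    col_means = x.mean(axis=0)  # one mean per rater (here: per segmenter)

    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * np.sum((row_means - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((col_means - grand) ** 2) / (k - 1)
    resid = x - row_means[:, None] - col_means[None, :] + grand
    ms_err = np.sum(resid ** 2) / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def icc_category(icc):
    """Map an ICC value to the bands quoted in the abstract.

    Assumptions: values below 0.5 are labeled "poor", and a value exactly
    on a threshold falls into the higher band.
    """
    if icc >= 0.9:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.5:
        return "moderate"
    return "poor"
```

For the design in the abstract, each of the 46 GAW measures would give one 20 x 3 array (videos by segmenters), and `icc_category(icc_2_1(values))` would assign that measure to a band.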