2022
DOI: 10.48550/arxiv.2204.11550
Preprint

Speech Detection For Child-Clinician Conversations In Danish For Low-Resource In-The-Wild Conditions: A Case Study

Abstract: The use of speech models for automatic speech processing tasks can improve efficiency in screening, analysis, diagnosis and treatment in medicine and psychiatry. However, the performance of pre-processing speech tasks such as segmentation and diarization can drop considerably on in-the-wild clinical data, particularly when the target dataset comprises atypical speech. In this paper we study the performance of a pre-trained speech model on a dataset comprising child-clinician conversations in Danish with res…

Citations: cited by 2 publications (3 citation statements)
References: 7 publications
“…We will use pretrained voice activation and diarization models that will be fine-tuned with a few manually annotated ground-truth labels of speech and nonspeech samples from our data set [ 21 ]. We follow this approach to ensure a balance between resources spent in training a model while maintaining high accuracy in the obtained speaker segments.…”
Section: Methods (mentioning)
confidence: 99%
“…These audio segments will be selected on the basis of insightful segments in the interview; for instance, questions in the interview associated with depression. In total, 10 interviews—5 with youth with OCD and 5 with those with no psychiatric diagnosis—have already been used to develop an appropriate method for speaker segmentation [ 21 ]. We will add the remaining 54 observations once we commence the described analysis.…”
Section: Methods (mentioning)
confidence: 99%
“…• Is there a way to create content privacy solutions that are reliable enough for very sensitive speech (e.g., speech used in medical research) to be useful in other tasks or at least shared between institutions without having to be stored on special servers while protecting the rights of individuals? → This is particularly important for large databases of speech, especially ones that might be considered sensitive (e.g., medical child speech [25]). Increased data sharing, when done properly, can result in large gains for people who stand to benefit most from speech and audio technology.…”
Section: Impacts On Downstream Tasks (mentioning)
confidence: 99%