2023
DOI: 10.3390/app13063927
|View full text |Cite
|
Sign up to set email alerts
|

Utilizing Machine Learning for Detecting Harmful Situations by Audio and Text

Abstract: Children with special needs may struggle to identify uncomfortable and unsafe situations. In this study, we aimed at developing an automated system that can detect such situations based on audio and text cues to encourage children’s safety and prevent situations of violence toward them. We composed a text and audio database with over 1891 sentences extracted from videos presenting real-world situations, and categorized them into three classes: neutral sentences, insulting sentences, and sentences indicating un… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
1
1

Citation Types

0
4
0

Year Published

2023
2023
2024
2024

Publication Types

Select...
3
1

Relationship

0
4

Authors

Journals

citations
Cited by 4 publications
(4 citation statements)
references
References 40 publications
0
4
0
Order By: Relevance
“…Paper [23] uses the NLU model to classify each utterance of a transcribed call into a finite set of labels. In [24], the authors construct a system for children with special needs that detects insulting and harmful speech in the context of a dialogue. They classified such speech into three types of sentences: insulting, harmful, and neutral.…”
Section: Related Workmentioning
confidence: 99%
See 2 more Smart Citations
“…Paper [23] uses the NLU model to classify each utterance of a transcribed call into a finite set of labels. In [24], the authors construct a system for children with special needs that detects insulting and harmful speech in the context of a dialogue. They classified such speech into three types of sentences: insulting, harmful, and neutral.…”
Section: Related Workmentioning
confidence: 99%
“…Paper [27] extensively reviews related methods. Moreover, both textual and audio features can be used, as described in [24,[28][29][30].…”
Section: Related Workmentioning
confidence: 99%
See 1 more Smart Citation
“…The proposed architecture was shown to be effective in learning rich semantic and contextual representations of news for the detection task. The authors in [16] devised a text and audio database with sentences extracted from videos presenting real-world situations and categorized them into three classes: neutral sentences, insulting sentences, and sentences representing unsafe conditions. A deep neural network was proposed to jointly consider text and audio embedding vectors, which resulted in accurate detections of unsafe and insulting situations.…”
Section: Introductionmentioning
confidence: 99%