What do we mean with sound semantics, exactly? A survey of taxonomies and ontologies of everyday sounds

Giordano, Bruno L.; Azevedo, Ricardo de Miranda; Plasencia-Calaña, Yenisel; Formisano, Elia; Dumontier, Michel

doi:10.3389/fpsyg.2022.964209

Cited by 7 publications

(7 citation statements)

References 34 publications


“…Nonetheless, Word2Vec outperformed these models, suggesting that natural sound semantics may not require complex contextual information for comprehension. This finding challenges the traditional view of natural sound perception’s semantic complexity, often examined through the lens of language semantics [47]. It suggests that the inherent characteristics of natural sounds, well-captured by Word2Vec’s relatively simple semantic mapping, may not necessitate the contextual information demanded by language semantics.…”

Section: Discussion (mentioning)

confidence: 76%

Bridging Auditory Perception and Natural Language Processing with Semantically informed Deep Neural Networks

Esposito, Valente, Plasencia-Calaña et al. 2024

Preprint

Self Cite


Sound recognition is effortless for humans but poses a significant challenge for artificial hearing systems. Deep neural networks (DNNs), especially convolutional neural networks (CNNs), have recently surpassed traditional machine learning in sound classification. However, current DNNs map sounds to labels using binary categorical variables, neglecting the semantic relations between labels. Cognitive neuroscience research suggests that human listeners exploit such semantic information besides acoustic cues. Hence, our hypothesis is that incorporating semantic information improves DNNs' sound recognition performance, emulating human behavior. In our approach, sound recognition is framed as a regression problem, with CNNs trained to map spectrograms to continuous semantic representations from NLP models (Word2Vec, BERT, and CLAP text encoder). Two DNN types were trained: semDNN with continuous embeddings and catDNN with categorical labels, both with a dataset extracted from a collection of 388,211 sounds enriched with semantic descriptions. Evaluations across four external datasets confirmed the superiority of semantic labeling from semDNN compared to catDNN, preserving higher-level relations. Importantly, an analysis of human similarity ratings for natural sounds showed that semDNN approximated human listener behavior better than catDNN, other DNNs, and NLP models. Our work contributes to understanding the role of semantics in sound recognition, bridging the gap between artificial systems and human auditory perception.


“…Here, we considered word2vec embeddings of the linguistic sound descriptions to retain information on the (linguistic) semantic relations between the sound sources. In the future, the same approach could be extended to different types of semantic embeddings, for example, derived from natural sound ontologies [26].…”

Section: Discussion (mentioning)

confidence: 99%

Semantically-Informed Deep Neural Networks For Sound Recognition

Esposito, Valente, Plasencia-Calaña et al. 2023

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Self Cite


Deep neural networks (DNNs) for sound recognition learn to categorize a barking sound as a "dog" and a meowing sound as a "cat" but do not exploit information inherent to the semantic relations between classes (e.g., both are animal vocalisations). Cognitive neuroscience research, however, suggests that human listeners automatically exploit higher-level semantic information on the sources besides acoustic information. Inspired by this notion, we introduce here a DNN that learns to recognize sounds and simultaneously learns the semantic relation between the sources (semDNN). Comparison of semDNN with a homologous network trained with categorical labels (catDNN) revealed that semDNN produces semantically more accurate labelling than catDNN in sound recognition tasks and that semDNN-embeddings preserve higher-level semantic relations between sound sources. Importantly, through a model-based analysis of human dissimilarity ratings of natural sounds, we show that semDNN approximates the behaviour of human listeners better than catDNN and several other DNN and NLP comparison models.


“…New representational models are required to structure data in various types and forms (Yao et al, 2019). The conceptual frameworks for the description of sounds range from simple sound classifications, to more complex cases such as taxonomies, to highly complex ontologies (Giordano et al, 2022). Taxonomies are arranged in nested, cumulative hierarchies, extending to certain depths (Giordano et al, 2022).…”

Section: Conceptual and Ontological Framework For Sounds (mentioning)

confidence: 99%

“…The conceptual frameworks for the description of sounds range from simple sound classifications, to more complex cases such as taxonomies, to highly complex ontologies (Giordano et al, 2022). Taxonomies are arranged in nested, cumulative hierarchies, extending to certain depths (Giordano et al, 2022). An ontology is a specification of a representational vocabulary for a shared domain of discourse and defines classes, relations, functions, and other objects (Gruber, 1993).…”

Section: Conceptual and Ontological Framework For Sounds (mentioning)

confidence: 99%

“…Several ontologies have been proposed to represent geospatial data (e.g., Fonseca et al (2000); Couclelis (2019); Daneshfar et al (2022)) also in the field of navigation and wayfinding (Letalle et al, 2020; Sarjakoski et al, 2013; Timpf, 2002; Wang and Issa, 2020). Ontologies and taxonomies for the characterisation of everyday sounds have been developed in several research fields, including auditory cognition, soundscape research, and artificial hearing (Giordano et al, 2022). Gaver (1993) introduced a theoretical framework in the early 1990's that has been very influential for subsequent auditory cognitive research on real-world sound perception (Giordano et al, 2022).…”

Section: Conceptual and Ontological Framework For Sounds (mentioning)

confidence: 99%


Urban Sound Mapping for Wayfinding – A theoretical Approach and an empirical Study

Nuhn, Hamburger, Timpf 2023

AGILE GIScience Ser.


Abstract. Conventional navigation systems use visually perceptible landmarks to navigate their users from a starting point to a destination. However, sometimes visual information is not enough for route guidance. Visually-impaired or elderly people may not be able to navigate using the visual sense. Furthermore, there may exist no outstanding (i.e., salient) visual landmarks that could be used to navigate. In such a case auditory information may be a helpful guide. We performed two online studies and a focus-group interview to identify possible sound classes in an urban environment. Based on our results, we gathered sounds in Augsburg and classified them according to their source. The findings support our notion that auditory information can be useful for spatial orientation and guidance in addition to or even replacing visual information.
