An automatic approach of audio feature engineering for the extraction, analysis and selection of descriptors

In this work, the affective state of users in virtual learning environments is recognized in terms of continuous arousal and valence dimensions, using multimodal information (audio, text and video) whenever any of these modalities is available. Virtual learning environments in which all three modalities are available all the time are uncommon; at some moments only video is available, while at others only text, or some combination of audio, text and video. Different approaches using feature-level fusion and decision-level fusion are proposed for multimodal recognition with missing data. Recognition from the available modalities is studied following the ideas of dropout in neural networks and of variable input length in recurrent neural networks. The proposal is innovative in two respects: it represents emotions in a continuous space, which is uncommon in virtual education, and it uses whichever modalities are available at a given moment, a common situation in virtual learning environments because people are not speaking or writing all the time.
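As an illustration of decision-level fusion with missing data, the sketch below averages per-modality (arousal, valence) predictions over whichever modalities are present at a given moment. It is a minimal example under stated assumptions, not the method of the cited paper: the function name `fuse_available`, the weighted-average rule and the sample values are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# A prediction is an (arousal, valence) pair on a continuous scale.
Prediction = Tuple[float, float]


def fuse_available(
    predictions: Dict[str, Optional[Prediction]],
    weights: Optional[Dict[str, float]] = None,
) -> Optional[Prediction]:
    """Weighted average of predictions over the modalities that are present.

    A modality whose prediction is None is treated as missing and skipped.
    Modalities without an explicit weight default to 1.0 (an assumption).
    """
    present = {m: p for m, p in predictions.items() if p is not None}
    if not present:
        return None  # nothing to fuse at this moment

    w = {m: (weights or {}).get(m, 1.0) for m in present}
    total = sum(w.values())
    arousal = sum(w[m] * p[0] for m, p in present.items()) / total
    valence = sum(w[m] * p[1] for m, p in present.items()) / total
    return (arousal, valence)


# Hypothetical moment in which the text modality is unavailable.
fused = fuse_available(
    {"audio": (0.6, -0.2), "text": None, "video": (0.2, 0.4)}
)
```

Feature-level fusion would instead combine the modality features before a single model makes the prediction; the abstract does not specify how either variant is implemented.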


“…3. This general procedure is based on the common practices observed in the literature [24,25,27,29,33,34,35].…”

Section: Our Approach (mentioning)

confidence: 99%

Analysis of different affective state multimodal recognition approaches with missing data-oriented to virtual learning environments

Salazar

Montoya-Múnera

Aguilar

2021

Heliyon

Self Cite



“…Most research about occupancy is related to minimize the energy consumption in smart environments, so they use mainly atmospheric conditions data to measure the energy produced. Based on the previous work of Jimenez et al [1], this article proposes the research in occupancy and activity estimation for smart buildings using audio information. Works such as [2,3] use audio information from statistical theory and sound engineer, which include duration, frequency, loudness and sonority, among others, to extract useful information that can be interpreted.…”

Section: Introduction (mentioning)

confidence: 99%

“…To investigate the problem of descriptor extraction for sound content, in [1] Jimenez et al present the extraction of audio descriptors from the time series theory [6], that is, considering the audios as a set of time series, to use these time series characteristics as audio descriptors. The developed approach allows the analysis and selection of descriptors from a given audio context, with a hybrid scheme of extraction of those audio descriptors based on sound variables, descriptive statistics or time series.…”

Section: Introduction (mentioning)

confidence: 99%
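The idea quoted above, treating an audio signal as a time series and using its characteristics as descriptors, can be sketched as follows. This is a hedged illustration only: the choice of short-time energy as the series, the specific statistics, and the names `frame_energy`, `lag1_autocorrelation` and `describe` are assumptions, not the descriptors used by Jimenez et al.

```python
import math
import statistics
from typing import Dict, List


def frame_energy(signal: List[float], frame_size: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in signal[i:i + frame_size]) / frame_size
        for i in range(0, len(signal) - frame_size + 1, frame_size)
    ]


def lag1_autocorrelation(series: List[float]) -> float:
    """Lag-1 autocorrelation, a simple time-series characteristic."""
    mean = statistics.fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        return 0.0  # constant series: no variation to correlate
    num = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    return num / denom


def describe(series: List[float]) -> Dict[str, float]:
    """Descriptive statistics plus one time-series descriptor."""
    return {
        "mean": statistics.fmean(series),
        "std": statistics.pstdev(series),
        "min": min(series),
        "max": max(series),
        "lag1_autocorr": lag1_autocorrelation(series),
    }


# Synthetic signal (hypothetical): a tone whose amplitude grows over time.
signal = [(n / 800) * math.sin(2 * math.pi * 5 * n / 100) for n in range(800)]
energy = frame_energy(signal, frame_size=100)
descriptors = describe(energy)
```

A selection step would then keep only the descriptors that are informative for the target audio context; the quoted text does not say which selection criterion is used.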


Audio Feature Engineering for Occupancy and Activity Estimation in Smart Buildings

et al. 2021

Self Cite


Occupancy and activity estimation have been extensively researched in the past few years. However, the techniques used either combine atmospheric features such as humidity and temperature, rely on many devices such as cameras and audio sensors, or are limited to speech recognition. This work proposes that occupancy and activity can be estimated from audio information alone, using an automatic audio feature engineering approach to extract, analyze and select descriptors/variables. This descriptor extraction scheme is used to determine occupancy and activity in specific smart environments, so that the approach can differentiate between academic, administrative and commercial environments. The approach is compared with previous similar work on occupancy and/or activity estimation in smart buildings (most of which includes other features, such as atmospheric and visual ones). In general, the results obtained are very encouraging compared to previous studies.


Statistical study of surface texture and chip formation during turning of AISI 1020 steel: Emphasis on parameters Rsk, Rku, and Rk family and on the chip thickness ratio

Martins

Dumont

2022

Int J Adv Manuf Technol



Cited by 10 publications

References 18 publications
