This paper proposes a framework for an Air Quality Decision Support System (AQDSS) and, as a proof of concept, develops an Internet of Things (IoT) application based on this framework. The application was assessed by means of a case study in the City of Madrid. We employed several types of sensors and combined outdoor and indoor measurements with spatiotemporal activity patterns to estimate an individual's Personal Air Pollution Exposure (PAPE). This pilot case study presents evidence that PAPE can be estimated using indoor air quality monitors and e-beacon technology, which have not previously been used in similar studies and which offer the advantages of being low-cost and unobtrusive to the individual. In future work, our IoT application can be extended to include prediction models, enabling dynamic feedback about PAPE risks. Furthermore, PAPE data from this type of application could be useful for air quality policy development, as well as in epidemiological studies that explore the effects of air pollution on particular diseases.
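To make the exposure estimate concrete, the sketch below shows the standard time-weighted microenvironment model that typically underlies PAPE calculations: exposure is the time-weighted average of the pollutant concentration in each microenvironment the person occupies, with occupancy segments inferred (in a setup like this one) from e-beacon proximity and location traces. This is a minimal illustration, not the paper's implementation; the function name, data layout, and numbers are all hypothetical.

```python
def estimate_pape(visits):
    """Time-weighted average exposure (e.g., ug/m3 of PM2.5).

    visits: list of (concentration, hours_spent) tuples, one per
    microenvironment segment (home, office, commute, outdoors, ...).
    """
    total_hours = sum(hours for _, hours in visits)
    if total_hours == 0:
        return 0.0
    # Weight each microenvironment's concentration by time spent there.
    return sum(conc * hours for conc, hours in visits) / total_hours

# Illustrative day: 8 h in an office at 12 ug/m3, 1 h commuting at
# 35 ug/m3, 15 h at home at 9 ug/m3.
day = [(12.0, 8.0), (35.0, 1.0), (9.0, 15.0)]
print(f"PAPE: {estimate_pape(day):.1f} ug/m3")  # -> PAPE: 11.1 ug/m3
```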
Speech is one of the most natural channels for expressing human emotion. Consequently, speech emotion recognition (SER) has been an active area of research, with a wide range of applications in domains such as biomedical diagnostics in healthcare and human–machine interaction. Recent work in SER has focused on end-to-end deep neural networks (DNNs); however, the scarcity of emotion-labeled speech datasets limits the potential of training a deep network from scratch. In this paper, we propose new approaches for classifying emotions from speech by combining conventional mel-frequency cepstral coefficients (MFCCs) with image features extracted from spectrograms by a pretrained convolutional neural network (CNN). Unlike prior studies that employ end-to-end DNNs, our methods eliminate the resource-intensive network training process. Using the best prediction model obtained, we also build an SER application that predicts emotions in real time. Among the proposed methods, the hybrid feature set fed into a support vector machine (SVM) achieves an accuracy of 0.713 on a 6-class prediction problem evaluated on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), which is higher than previously published results. Interestingly, MFCCs taken as the sole input to a long short-term memory (LSTM) network achieve a slightly higher accuracy of 0.735. Our results show that the proposed approaches improve prediction accuracy, and the empirical findings demonstrate the effectiveness of a pretrained CNN as an automatic feature extractor for emotion prediction. Moreover, the success of the MFCC-LSTM model is evidence that, despite being conventional features, MFCCs can still outperform more sophisticated deep-learning feature sets.
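The following sketch illustrates the hybrid-feature idea: clip-level MFCC statistics concatenated with pooled activations from a frozen, ImageNet-pretrained CNN applied to a log-mel spectrogram, then fed to an SVM. This is a minimal assumption-laden example, not the paper's code: it assumes librosa for audio features, uses VGG16 as a stand-in for whichever pretrained CNN the authors used, and the paths, labels, and hyperparameters are placeholders.

```python
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from sklearn.svm import SVC

# Frozen feature extractor: no training, only a forward pass (global
# average pooling yields one 512-dim vector per spectrogram "image").
cnn = VGG16(weights="imagenet", include_top=False, pooling="avg")

def mfcc_features(y, sr, n_mfcc=40):
    # Averaging over time gives one fixed-length vector per clip.
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

def cnn_features(y, sr):
    # Render the clip as a log-mel spectrogram, tile it to 3 channels,
    # resize to the CNN's expected input size, and extract activations.
    s = librosa.power_to_db(librosa.feature.melspectrogram(y=y, sr=sr))
    img = np.stack([s, s, s], axis=-1)            # (mels, frames, 3)
    img = tf.image.resize(img, (224, 224)).numpy()
    img = preprocess_input(img[np.newaxis, ...])
    return cnn.predict(img, verbose=0).ravel()    # 512-dim vector

def hybrid_features(path):
    y, sr = librosa.load(path, sr=22050)
    return np.concatenate([mfcc_features(y, sr), cnn_features(y, sr)])

# Hypothetical usage: wav_paths and emotion_labels come from a labeled
# corpus such as RAVDESS.
# X = np.stack([hybrid_features(p) for p in wav_paths])
# clf = SVC(kernel="rbf").fit(X, emotion_labels)  # 6-class prediction
```

The MFCC-LSTM variant mentioned above would instead keep the MFCC frames as a time series and feed the (frames, n_mfcc) sequence to a recurrent classifier rather than averaging over time.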