Since 2019, every country in the world has faced the rapid spread of the COVID-19 pandemic, which the world community continues to fight to this day. Although personal respiratory protection equipment is clearly effective against coronavirus infection, many people neglect to wear protective face masks in public places. To monitor compliance and identify violators of public health regulations in a timely manner, modern information technologies are needed that detect protective masks on people's faces from video and audio information. The article presents an analytical review of existing and emerging intelligent information technologies for bimodal analysis of the voice and facial characteristics of a masked person. Many studies address mask detection in video images, and a significant number of publicly available datasets contain images of faces with and without masks obtained by various methods. Research and development on detecting personal respiratory protection equipment from the acoustic characteristics of human speech is still scarce, since this direction began to develop only during the COVID-19 pandemic. Existing systems help prevent the spread of coronavirus infection by recognizing the presence or absence of a mask on the face, and they also support remote diagnosis of COVID-19 by detecting the first symptoms of a viral infection from acoustic characteristics. However, a number of problems in the automatic diagnosis of COVID-19 and in the detection of masks on people's faces remain unresolved. The first is the low accuracy of detecting masks and coronavirus infection, which rules out automatic diagnosis without experts (medical personnel) present.
Many systems cannot operate in real time, which makes it impossible to control and monitor the wearing of protective masks in public places. In addition, most existing systems cannot be built into a smartphone, so users cannot check for coronavirus infection wherever they are. Another major problem is collecting data from patients infected with COVID-19, as many people are unwilling to share confidential information.
The article presents an analytical review of research in the field of affective computing. This research direction is a component of artificial intelligence; it studies methods, algorithms and systems for analyzing human affective states during interactions with other people, computer systems or robots. In the field of data mining, affect is defined as the manifestation of psychological reactions to a stimulating event, which can occur in both the short and the long term and can vary in intensity. Affects in this field are divided into four types: affective emotions, basic emotions, sentiment and affective disorders. Affective states are reflected in verbal data and in non-verbal characteristics of behavior: acoustic and linguistic characteristics of speech, facial expressions, gestures and postures. The review provides a comparative analysis of the existing information resources for automatic recognition of a person's affective states, using emotions, sentiment, aggression and depression as examples. The few Russian-language affective databases are still significantly inferior in volume and quality to electronic resources in other world languages. There is therefore a need to consider a wide range of additional approaches, methods and algorithms suited to limited amounts of training and testing data, and to set the task of developing new approaches to data augmentation, transfer learning and the adaptation of foreign-language resources. The article describes methods for analyzing unimodal visual, acoustic and linguistic information, as well as multimodal approaches to affective state recognition. A multimodal approach to automatic affective state analysis makes it possible to increase recognition accuracy compared with single-modality solutions.
The review notes a trend in modern research: neural network methods are gradually replacing classical deterministic methods because they offer better recognition quality and fast processing of large amounts of data. The article discusses methods for affective state analysis. The advantage of multitask hierarchical approaches is their ability to extract new types of knowledge, including the influence, correlation and interaction of several affective states with one another, which can potentially improve recognition quality. The article also outlines potential requirements for systems being developed for affective state analysis and the main directions of further research.