Image captioning is an interesting and challenging task with applications in diverse domains such as image retrieval and organizing and locating images of interest to users. It has huge potential to replace manual caption generation and is especially suitable for large-scale image data. Recently, deep neural network based methods have achieved great success in computer vision, machine translation and language generation. In this paper, we propose an encoder-decoder based model capable of generating grammatically correct captions for images. The model uses VGG16 Hybrid Places 1365 as the encoder and an LSTM as the decoder. To ensure ground truth accuracy, the model is trained on the labelled Flickr8k and MSCOCO Captions datasets. The model is evaluated using the standard metrics BLEU, METEOR, GLEU and ROUGE-L. Experimental results indicate that the proposed model obtained a BLEU-1 score of 0.6666, a METEOR score of 0.5060 and a GLEU score of 0.2469 on the Flickr8k dataset, and a BLEU-1 score of 0.7350, a METEOR score of 0.4768 and a GLEU score of 0.2798 on the MSCOCO Captions dataset. Thus, the proposed method achieves significant performance compared with state-of-the-art approaches. To further evaluate the efficacy of the model, we also show caption generation results on live sample images, which reinforce the validity of the proposed approach.
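A minimal sketch of the encoder-decoder captioning idea described above is given below. It assumes a Keras setup; since keras.applications ships only ImageNet VGG16 weights, they stand in for the VGG16 Hybrid Places 1365 encoder, and the vocabulary size, embedding size and maximum caption length are placeholder values rather than the paper's settings.

```python
# Sketch of an encoder-decoder (CNN + LSTM) captioning model.
# Assumption: ImageNet VGG16 weights stand in for VGG16 Hybrid Places 1365;
# VOCAB_SIZE, MAX_LEN and EMBED_DIM are hypothetical hyperparameters.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Input, Dense, Embedding, LSTM, Dropout, add
from tensorflow.keras.models import Model

VOCAB_SIZE, MAX_LEN, EMBED_DIM = 8000, 34, 256

# Encoder: frozen CNN; captions are conditioned on its 4096-d fc2 features.
cnn = VGG16(weights="imagenet", include_top=True)
encoder = Model(cnn.input, cnn.get_layer("fc2").output)

# Decoder: image features and the partial caption are merged, then a softmax
# over the vocabulary predicts the next word.
img_in = Input(shape=(4096,))
img_feat = Dense(EMBED_DIM, activation="relu")(Dropout(0.5)(img_in))

cap_in = Input(shape=(MAX_LEN,))
cap_feat = LSTM(EMBED_DIM)(Embedding(VOCAB_SIZE, EMBED_DIM, mask_zero=True)(cap_in))

merged = Dense(EMBED_DIM, activation="relu")(add([img_feat, cap_feat]))
out = Dense(VOCAB_SIZE, activation="softmax")(merged)

caption_model = Model([img_in, cap_in], out)
caption_model.compile(loss="categorical_crossentropy", optimizer="adam")
caption_model.summary()
```

At inference time, captions would be generated word by word, feeding each predicted token back into the decoder until an end-of-sequence token or the maximum length is reached.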
The outbreak of the COVID-19 pandemic has led to the widespread use of face masks in communal settings. To prevent the transmission of the virus, mandatory mask-wearing rules in public areas have been enforced. Owing to the use of face masks in communities and at workplaces, effective surveillance is essential, because several security analyses indicate that face masks may be used as a tool to hide one's identity. Therefore, this work proposes a framework for a smart post-COVID-19 surveillance system for the recognition of individuals behind face masks. For this purpose, a transfer learning approach is employed to train the YOLOv3 algorithm on a custom dataset in the Darknet neural network framework. Moreover, to demonstrate the competence of the YOLOv3 algorithm, a comparative analysis with YOLOv3-tiny is presented. The simulation results verify the robustness of the YOLOv3 algorithm in recognizing individuals behind face masks. YOLOv3 achieves a mAP of 98.73% on the custom dataset, outperforming YOLOv3-tiny by approximately 62%. Moreover, the YOLOv3 algorithm provides adequate speed and accuracy on small faces.
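For illustration, the sketch below runs inference with a Darknet-trained YOLOv3 model through OpenCV's DNN module rather than the Darknet binary used in the paper. The file names (yolov3_mask.cfg, yolov3_mask.weights, classes.txt) are placeholders for whatever the custom-dataset training produced, and the thresholds are illustrative.

```python
# Sketch: YOLOv3 inference via OpenCV DNN on a Darknet-trained model.
# Assumptions: placeholder cfg/weights/class files; 0.5/0.4 thresholds are illustrative.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3_mask.cfg", "yolov3_mask.weights")
classes = open("classes.txt").read().splitlines()

img = cv2.imread("sample.jpg")
h, w = img.shape[:2]
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())

boxes, confidences, class_ids = [], [], []
for output in outputs:
    for det in output:
        scores = det[5:]
        cid = int(np.argmax(scores))
        conf = float(scores[cid])
        if conf > 0.5:
            # Darknet outputs box centers and sizes normalized to the image.
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(conf)
            class_ids.append(cid)

# Non-maximum suppression removes overlapping detections of the same face.
keep = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
for i in np.array(keep).flatten():
    x, y, bw, bh = boxes[i]
    print(classes[class_ids[i]], round(confidences[i], 3), (x, y, bw, bh))
```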
Drivers undergo considerable stress that can cause distraction and lead to unfortunate incidents. Emotion recognition via facial expressions is one of the most important fields in the human-machine interface. The goal of this paper is to analyze drivers' facial expressions in order to monitor their stress levels. We propose FERNET, a hybrid deep convolutional neural network model for driver stress recognition through facial emotion recognition. FERNET integrates two DCNNs: a pre-trained ResNet101V2 CNN and a custom CNN, ConvNet4. Experiments were carried out on the widely used public datasets CK+, FER2013 and AffectNet, achieving accuracies of 99.70%, 74.86% and 70.46%, respectively, for facial emotion recognition. These results outperform recent state-of-the-art methods. Furthermore, since a few specific isolated emotions lead to higher stress levels, we analyze the results for stress-related and non-stress-related emotions on each dataset. FERNET achieves stress prediction accuracies of 98.17%, 90.16% and 84.49% on the CK+, FER2013 and AffectNet datasets, respectively.
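The sketch below illustrates the two-branch hybrid idea only: a pre-trained ResNet101V2 branch and a small custom CNN branch whose features are fused for emotion classification. The branch widths, input size, fusion layer and the 7-class head are assumptions for illustration, not FERNET's published configuration.

```python
# Sketch of a ResNet101V2 + custom-CNN hybrid emotion classifier.
# Assumptions: layer sizes, input shape and class count are illustrative, not FERNET's.
from tensorflow.keras.applications import ResNet101V2
from tensorflow.keras import layers, Model, Input

NUM_EMOTIONS = 7  # hypothetical emotion categories
inp = Input(shape=(224, 224, 3))

# Branch 1: pre-trained ResNet101V2 used as a frozen feature extractor.
resnet = ResNet101V2(weights="imagenet", include_top=False, pooling="avg")
resnet.trainable = False
feat_a = resnet(inp)

# Branch 2: a small custom CNN in the spirit of "ConvNet4" (four conv blocks).
x = inp
for filters in (32, 64, 128, 256):
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D()(x)
feat_b = layers.GlobalAveragePooling2D()(x)

# Fuse both branches and classify into emotion categories.
merged = layers.concatenate([feat_a, feat_b])
merged = layers.Dropout(0.5)(layers.Dense(512, activation="relu")(merged))
out = layers.Dense(NUM_EMOTIONS, activation="softmax")(merged)

hybrid_model = Model(inp, out)
hybrid_model.compile(optimizer="adam", loss="categorical_crossentropy",
                     metrics=["accuracy"])
```

Stress prediction could then be derived by grouping the predicted emotion labels into stress-related and non-stress-related categories, as the paper does per dataset.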