Facial Expression Recognition using Residual Convnet with Image Augmentations

Rahadika, Fadhil Yusuf; Yudistira, Novanto; Sari, Yuita Arum

doi:10.21609/jiki.v14i2.968

Cited by 8 publications

(3 citation statements)

References 13 publications

“…In addition, it can leverage the data augmentation process for data enrichment. Data augmentation is a process that uses digital image processing to transform digital images into new digital images [22]. The benefit of data augmentation can also be seen in the paper by [38], which shows that augmentation affects the training outcomes of an ANN, yielding higher accuracy and lower loss values than training without data augmentation, because data augmentation helps the ANN recognize various patterns. There is much research on data augmentation methods to increase the performance of ANNs.…”
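The idea in the statement above, producing new images from existing ones, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`horizontal_flip`, `shift_right`, `augment`), the choice of transforms, and the list-of-rows image format are hypothetical and are not taken from any of the cited papers.

```python
import random


def horizontal_flip(image):
    """Mirror a 2D image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def shift_right(image, pixels, fill=0):
    """Shift every row right by `pixels`, padding the left edge with `fill`."""
    shifted = []
    for row in image:
        k = max(0, min(pixels, len(row)))  # clamp the shift to the row width
        shifted.append([fill] * k + row[:len(row) - k])
    return shifted


def augment(image, rng):
    """Return a randomly transformed copy of `image` (hypothetical policy)."""
    out = [row[:] for row in image]
    if rng.random() < 0.5:  # flip half of the time
        out = horizontal_flip(out)
    return shift_right(out, rng.randint(0, 1))  # shift by 0 or 1 pixel


image = [[1, 2, 3],
         [4, 5, 6]]
augmented = augment(image, random.Random(0))
```

Each call yields an image of the same shape as the input, so augmented copies can be mixed into a training set alongside the originals.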

Section: Data Augmentation (mentioning)

confidence: 99%

Mask Usage Recognition using Vision Transformer with Transfer Learning and Data Augmentation

Jahja, Yudistira, Sutrisno

2022

Preprint

Self Cite

The COVID-19 pandemic has disrupted various levels of society. Wearing masks is essential in preventing the spread of COVID-19, and mask usage can be identified from an image of a person wearing a mask. Only 23.1% of people use masks correctly, but Artificial Neural Networks (ANN) can help classify correct mask usage and so help slow the spread of the COVID-19 virus. However, training an ANN that classifies mask usage correctly requires a large dataset. MaskedFace-Net is a suitable dataset consisting of 137,016 digital images with 4 class labels, namely Mask, Mask Chin, Mask Mouth Chin, and Mask Nose Mouth. Mask classification training uses the Vision Transformer (ViT) architecture with transfer learning from weights pre-trained on ImageNet-21k, together with random augmentation. The training hyper-parameters are 20 epochs, a Stochastic Gradient Descent (SGD) optimizer with a learning rate of 0.03, a batch size of 64, a Gaussian Error Linear Unit (GELU) activation function, and a cross-entropy loss function, applied to three ViT architectures, namely Base-16, Large-16, and Huge-14. Furthermore, training with and without augmentation and transfer learning is compared. This study found that the best classification is obtained with transfer learning and augmentation using ViT Huge-14. Using this method on the MaskedFace-Net dataset, the research reaches an accuracy of 0.9601 on training data, 0.9412 on validation data, and 0.9534 on test data. This research shows that training the ViT model with data augmentation and transfer learning improves classification of mask usage, even beyond the convolution-based Residual Network (ResNet).
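The GELU activation named in the abstract weights its input by the standard normal cumulative distribution, GELU(x) = x · Φ(x). A minimal sketch of the exact (erf-based) form follows, using only the Python standard library; the function name `gelu` is chosen here for illustration, and the abstract does not say whether the exact or an approximate form was used.

```python
import math


def gelu(x):
    """Exact GELU: x times the standard normal CDF evaluated at x."""
    normal_cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return x * normal_cdf


# Large positive inputs pass through almost unchanged,
# large negative inputs are pushed toward zero.
values = [gelu(v) for v in (-10.0, 0.0, 1.0, 10.0)]
```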

“…Thus, the transfer learning method can be a solution to overcome the shortage of training data and improve CNN performance by utilizing the source domain to improve model performance in the target domain. CNN models using transfer learning have been applied in several studies, such as image classification of human facial expressions [3], COVID-19 x-ray images [4] and food [5]. To this end, our research objective is the development of a deep learning system with a MobileNet model using transfer learning to recognize exotic fruit images accurately.…”

Section: Introduction (mentioning)

confidence: 99%

Large Scale Image Classification of Exotic Fruits in Indonesia Using Transfer Learning Method with MobileNet Model

Prabandani, Yudistira, Nisa

2023

Advances in Economics, Business and Management Research

Self Cite

An exotic fruit is a fruit that is not widely known to the public. In Indonesia, there are many exotic fruits such as rambutan, passion fruit, mangosteen, longan, guava, and many more. Classification of exotic fruit images is needed because outsiders lack knowledge about exotic fruits in Indonesia. To this end, developing robust artificial intelligence using deep learning is necessary. The CNN is a development of the Multilayer Perceptron (MLP) designed to process two-dimensional data; it is a type of Deep Neural Network because of its high network depth and is widely applied to image data. By utilizing the transfer learning method and a little fine-tuning, an efficient model like MobileNet is expected to be better than the FruitNet model trained without transfer learning. Our contribution is applying efficient transfer learning with MobileNet to exotic fruits in Indonesia, which achieves 87% accuracy on average using more than 1000 images. The model performs better than the previous FruitNet model, which only reaches 43% accuracy on average.

“…Human Activity Recognition (HAR) is an introduction to human activities that refer to the movements performed by an individual on certain body parts. HAR has become a widely discussed scientific topic in the Computer Vision community because it is involved in many Human-Computer Interaction (HCI) application developments [1], [2]. One branch of HAR is human emotion.…”

Section: Introduction (mentioning)

confidence: 99%

Facial Expression Recognition Using Convolutional Neural Network with Attention Module

Khoirullah, Yudistira, Bachtiar

2022

JOIV : Int. J. Inform. Visualization

Human Activity Recognition (HAR) is the recognition of human activities, that is, the movements performed by an individual with specific body parts. One branch of HAR is human emotion. Facial emotion is vital in human communication to help convey emotional states and intentions. Facial Expression Recognition (FER) is crucial to understanding how humans communicate. Misinterpreting facial expressions can lead to misunderstanding and difficulty reaching common ground. Deep learning can help in recognizing these facial expressions. To improve Facial Expression Recognition, we propose a ResNet with an attached Attention module to push the performance forward. This approach performs better than the standalone ResNet because the localization and sampling grid allows the model to learn how to perform spatial transformations on the input image. Consequently, it improves the model's geometric invariance and picks up the features of the expressions from the human face, resulting in better classification results. This study shows that the proposed method with attention is better than without, with a test accuracy of 0.7789 on the FER dataset and 0.8327 on the FER+ dataset. It concludes that the Attention module is essential in recognizing facial expressions using a Convolutional Neural Network (CNN). For further research, we advise, first, adding more datasets besides FER and FER+ and, second, adding a scheduler to decrease the learning rate during training.
