Introduction To date, most ways to perform facial expression recognition rely on two-dimensional images, advanced approaches with three-dimensional data exist. These however demand
stationary apparatuses and thus lack portability and possibilities to scale deployment. As human emotions, intent and even diseases may condense in distinct facial expressions or changes
therein, the need for a portable yet capable solution is signified. Due to the superior informative value of three-dimensional data on facial morphology and because certain syndromes find
expression in specific facial dysmorphisms, a solution should allow portable acquisition of true three-dimensional facial scans in real time. In this study we present a novel solution for
the three-dimensional acquisition of facial geometry data and the recognition of facial expressions from it. The new technology presented here only requires the use of a smartphone or tablet
with an integrated TrueDepth camera and enables real-time acquisition of the geometry and its categorization into distinct facial expressions.
Material and Methods Our approach consisted of two parts: First, training data was acquired by asking a collective of 226 medical students to adopt defined facial expressions while
their current facial morphology was captured by our specially developed app running on iPads, placed in front of the students. In total, the list of the facial expressions to be shown by the
participants consisted of “disappointed”, “stressed”, “happy”, “sad” and “surprised”. Second, the data were used to train a self-normalizing neural network. A set of all factors describing
the current facial expression at a time is referred to as “snapshot”.
Results In total, over half a million snapshots were recorded in the study. Ultimately, the network achieved an overall accuracy of 80.54% after 400 epochs of training. In test, an
overall accuracy of 81.15% was determined. Recall values differed by the category of a snapshot and ranged from 74.79% for “stressed” to 87.61% for “happy”. Precision showed similar results,
whereas “sad” achieved the lowest value at 77.48% and “surprised” the highest at 86.87%.
Conclusions With the present work it can be demonstrated that respectable results can be achieved even when using data sets with some challenges. Through various measures, already
incorporated into an optimized version of our app, it is to be expected that the training results can be significantly improved and made more precise in the future. Currently a follow-up
study with the new version of our app that encompasses the suggested alterations and adaptions, is being conducted. We aim to build a large and open database of facial scans not only for
facial expression recognition but to perform disease recognition and to monitor diseases’ treatment progresses.