In this paper, we consider an approach that can significantly increase the accuracy of facial emotion recognition by adapting the model to the emotions of a particular user (e.g., the smartphone owner). At the first stage, a neural network model, previously trained to recognize facial expressions in static photos, is used to extract visual features of the face in each frame. Next, the facial features of the video frames are aggregated into a single descriptor for a short video fragment, and a neural network classifier is trained on these descriptors. At the second stage, it is proposed that this classifier be adapted (fine-tuned) using a small set of videos with the facial expressions of a particular user. After emotion classification, the user can correct the predicted emotions to further improve the accuracy of the personal model. An experimental study on the RAVDESS dataset shows that the approach with model adaptation to a specific user can significantly (by up to 20–50%) improve the accuracy of facial expression recognition in video.
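The two-stage scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: synthetic Gaussian vectors stand in for the CNN face features, mean pooling is assumed as the frame-aggregation step, a linear softmax model stands in for the neural-network classifier, and a user-specific offset of the descriptors mimics the appearance of one particular user. Only the eight-class setting matches RAVDESS; all other names and parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

N_EMOTIONS = 8   # RAVDESS distinguishes 8 emotion classes
FEAT_DIM = 16    # stand-in for the dimensionality of the CNN face features

def aggregate(frame_features):
    """Aggregate per-frame CNN features of a short video fragment into a
    single descriptor (mean pooling here; the paper's exact scheme may differ)."""
    return np.asarray(frame_features).mean(axis=0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class EmotionClassifier:
    """Minimal linear softmax classifier standing in for the neural-network
    classifier trained on video descriptors."""
    def __init__(self):
        self.W = np.zeros((FEAT_DIM, N_EMOTIONS))
        self.b = np.zeros(N_EMOTIONS)

    def fit(self, X, y, epochs=300, lr=0.5):
        Y = np.eye(N_EMOTIONS)[y]
        for _ in range(epochs):
            P = softmax(X @ self.W + self.b)
            self.W -= lr * X.T @ (P - Y) / len(X)
            self.b -= lr * (P - Y).mean(axis=0)

    def predict(self, X):
        return (X @ self.W + self.b).argmax(axis=1)

# Synthetic descriptors: each emotion class has its own mean; videos of one
# particular user are shifted by a user-specific offset (mimicking personal
# appearance and expression style).
class_means = rng.normal(size=(N_EMOTIONS, FEAT_DIM)) * 3.0
user_shift = rng.normal(size=FEAT_DIM) * 4.0

def sample(n, shift=0.0):
    y = rng.integers(0, N_EMOTIONS, size=n)
    X = class_means[y] + shift + rng.normal(size=(n, FEAT_DIM))
    return X, y

# Stage 1: train the classifier on a generic corpus of video descriptors.
X_gen, y_gen = sample(400)
clf = EmotionClassifier()
clf.fit(X_gen, y_gen)

# Stage 2: fine-tune on a small labeled set from one user, then evaluate
# on held-out videos of the same user.
X_user, y_user = sample(40, user_shift)
X_test, y_test = sample(200, user_shift)

acc_base = (clf.predict(X_test) == y_test).mean()
clf.fit(X_user, y_user, epochs=200, lr=0.2)   # fine-tuning pass
acc_ft = (clf.predict(X_test) == y_test).mean()
```

In this toy setting the fine-tuned model recovers most of the accuracy lost to the user-specific shift, which is the qualitative effect the paper reports on RAVDESS.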