Robots can mimic human beings, including recognizing human faces and emotions. However, existing studies on humanoid robots have not implemented such recognition in a real-time system, and face recognition and emotion recognition have been treated as separate problems. For real-time application on a humanoid robot, this study therefore proposes combining face recognition and emotion recognition. Both recognition systems were developed concurrently using convolutional neural network architectures. The proposed architecture was compared with the well-known AlexNet architecture to determine which is better suited for implementation on a humanoid robot. Primary data from 30 respondents were used for face recognition. Emotion data were collected from the same respondents and combined with secondary data from a 2500-person dataset. The emotions considered included surprise, anger, neutral, smile, and sadness. The experiment was carried out in real time on a humanoid robot using the two architectures. With the AlexNet model, the accuracy of face and emotion recognition was 87 % and 70 %, respectively. The proposed architecture achieved accuracies of 95 % for face recognition and 75 % for emotion recognition. Thus, the proposed method recognizes faces and emotions more accurately than AlexNet and can be implemented on a humanoid robot.
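
The reported comparison can be tabulated and the gains expressed in percentage points. The following is a minimal sketch: the dictionary layout and the names `reported_accuracy` and `gain_in_points` are illustrative assumptions, and only the four accuracy values come from the study.

```python
# Reported real-time accuracies (percent), taken from the abstract.
# The data structure and names are illustrative, not from the study.
reported_accuracy = {
    "AlexNet": {"face": 87, "emotion": 70},
    "Proposed": {"face": 95, "emotion": 75},
}


def gain_in_points(results, baseline, candidate):
    """Return the per-task accuracy gain of candidate over baseline, in percentage points."""
    return {
        task: results[candidate][task] - results[baseline][task]
        for task in results[baseline]
    }


gains = gain_in_points(reported_accuracy, "AlexNet", "Proposed")
for task, delta in gains.items():
    # The "+d" format prints an explicit sign, so a regression would show as negative.
    print(f"{task}: {delta:+d} percentage points")
```

With the reported figures, the proposed architecture gains 8 points on face recognition and 5 points on emotion recognition over AlexNet.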