Gestures, a form of body language, significantly influence how users perceive humanoid robots. Recent data-driven methods for co-speech gesture generation have improved the naturalness of the generated gestures and, compared to rule-based systems, generalize better to unseen speech input. However, many of these methods cannot directly influence people’s perceptions of robots. The primary challenge lies in the difficulty of constructing a dataset with varied impression labels for training a conditional generation model. In our prior work [22] ("Controlling the impression of robots via GAN-based gesture generation," in Proceedings of the International Conference on Intelligent Robots and Systems, IEEE, pp. 9288-9295), we introduced a heuristic approach for automatic labeling and trained a deep learning model to control robot impressions. We demonstrated the model’s effectiveness on both a virtual agent and a humanoid robot. In this study, we refined the motion retargeting algorithm for the humanoid robot and conducted a user study using four questions representing different aspects of extroversion. Our results show an improved capability to control the perceived degree of extroversion in the humanoid robot compared to previous methods. Furthermore, we found that different aspects of extroversion interact uniquely with motion statistics.