Self-esteem is an important psychological resource, yet behavioral assessments of it remain rare. Inferring self-esteem from gait patterns captured by ordinary cameras could meet the need for real-time, population-level assessment. A total of 152 healthy students with no walking impairments were recruited as participants. Gait data were recorded with a standard 2D camera, and self-esteem scores were measured with the Rosenberg Self-Esteem Scale (RSES). After preprocessing, dynamic gait features were extracted and used to train machine learning models that predict self-esteem scores. The best predictions came from Gaussian process and linear regression models, with correlations between predicted and self-reported scores of 0.51 (p < 0.001) for all participants, 0.52 (p < 0.001) for males, and 0.46 (p < 0.001) for females. The highest reliability, 0.92, was achieved by support vector regression with an RBF kernel. These results indicate that gait captured by a 2D camera can predict self-esteem reasonably well. Video-based gait analysis is therefore a useful supplement to existing methods for the ecological assessment of self-esteem.
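
The evaluation summarized above, correlating model predictions with questionnaire scores, can be illustrated with a minimal sketch. It is not the study's pipeline: the data are synthetic, the single `feature` stands in for the extracted gait features, and the 10-fold cross-validation scheme, the one-feature least-squares model, and all function names are assumptions made for illustration only.

```python
import math
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def fit_simple_ols(x, y):
    """Least-squares slope and intercept for a single predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def cross_val_predict(x, y, k=10, seed=0):
    """Out-of-fold predictions from k-fold cross-validation (k is an assumption)."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(x)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        slope, intercept = fit_simple_ols(
            [x[i] for i in train], [y[i] for i in train]
        )
        # Predict only for participants the model has not seen.
        for i in fold:
            preds[i] = slope * x[i] + intercept
    return preds


# Synthetic stand-in data: NOT the study's measurements.
rng = random.Random(42)
n = 152  # same sample size as the study; values are simulated
feature = [rng.gauss(0, 1) for _ in range(n)]           # hypothetical gait feature
scores = [30 + 2.0 * f + rng.gauss(0, 3) for f in feature]  # hypothetical RSES-like scores

preds = cross_val_predict(feature, scores)
r = pearson_r(preds, scores)
print(f"correlation between predicted and observed scores: r = {r:.2f}")
```

Correlating out-of-fold predictions, rather than in-sample fits, keeps the reported coefficient from being inflated by overfitting; the same evaluation would apply unchanged if the least-squares model were swapped for Gaussian process or support vector regression.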