Construction robots are challenging the paradigm of labor-intensive construction work. Imitation learning (IL) offers a promising approach by enabling robots to mimic expert actions. However, obtaining high-quality expert demonstrations is a major bottleneck, because teleoperated robot motions may not align with optimal kinematic behavior. This paper proposes two innovations. First, conventional controller-based teleoperation is replaced with vision-based hand gesture control for intuitive demonstration collection. Second, a novel method integrates demonstrations with simple environmental rewards to strike a balance between imitation and exploration. Both are realized in a two-step training process: in the first step, expert demonstrations are collected on an intuitive virtual reality platform; in the second step, a learning algorithm trains a policy for construction tasks. Experimental results demonstrate that combining IL with environmental rewards can significantly accelerate training, even with limited demonstration data.
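As a rough illustration of what balancing imitation against exploration can look like, the minimal sketch below blends an environmental reward with an imitation term through a single weight. It is not the method of this paper: the function names, the squared-distance imitation term, and the `imitation_weight` parameter are all assumptions introduced for illustration.

```python
from typing import Sequence


def imitation_reward(agent_action: Sequence[float],
                     expert_action: Sequence[float]) -> float:
    """Hypothetical imitation term: negative squared distance to the expert action."""
    if len(agent_action) != len(expert_action):
        raise ValueError("agent and expert actions must have the same length")
    return -sum((a - e) ** 2 for a, e in zip(agent_action, expert_action))


def combined_reward(env_reward: float,
                    agent_action: Sequence[float],
                    expert_action: Sequence[float],
                    imitation_weight: float = 0.5) -> float:
    """Blend the environmental reward with the imitation term.

    imitation_weight = 0 ignores the demonstration (reward-driven exploration only);
    imitation_weight = 1 ignores the environment (pure imitation).
    The linear blend is an assumption, not the paper's formulation.
    """
    if not 0.0 <= imitation_weight <= 1.0:
        raise ValueError("imitation_weight must lie in [0, 1]")
    imitation = imitation_reward(agent_action, expert_action)
    return (1.0 - imitation_weight) * env_reward + imitation_weight * imitation
```

In a sketch like this, the weight controls how strongly the demonstrations steer learning: a larger value keeps the policy close to the expert, while a smaller value lets the environmental reward drive exploration.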