With an increasing number of robots are employed in manufacturing, a human-robot interaction method that can teach robots in a natural, accurate, and rapid manner is needed. In this paper, we propose a novel human-robot interface based on the combination of static hand gestures and hand poses. In our proposed interface, the pointing direction of the index finger and the orientation of the whole hand are extracted to indicate the moving direction and orientation of the robot in a fast-teaching mode. A set of hand gestures are designed according to their usage in humans' daily life and recognized to control the position and orientation of the robot in a fine-teaching mode. We employ the feature extraction ability of the hand pose estimation network via transfer learning and utilize attention mechanisms to improve the performance of the hand gesture recognition network. The inputs of hand pose estimation and hand gesture recognition networks are monocular RGB images, making our method independent of depth information input and applicable to more scenarios. In the regular shape reconstruction experiments on the UR3 robot, the mean error of the reconstructed shape is less than 1 mm, which demonstrates the effectiveness and efficiency of our method.