Human conversations are guided by short-term and long-term goals. We study how to plan short-term goal sequences as coherently as humans do and naturally direct them to an assigned long-term goal in open-domain conversations. Goal sequences are a series of knowledge graph (KG) entity-relation connections generated by KG walkers that traverse through the KG. The existing recurrent and graph attention based KG walkers either insufficiently utilize the conversation states or lack global guidance. In our work, a hierarchical model learns goal planning in a hierarchical learning framework. We present HiTKG, a hierarchical transformer-based graph walker that leverages multiscale inputs to make precise and flexible predictions on KG paths. Furthermore, we propose a two-hierarchy learning framework that employs two stages to learn both turn-level (short-term) and global-level (long-term) conversation goals. Specifically, at the first stage, HiTKG is trained in a supervised fashion to learn how to plan turn-level goal sequences; at the second stage, HiTKG tries to naturally approach the assigned global goal via reinforcement learning. In addition, we propose MetaPath as the backbone method for KG path representation to exploit the entity and relation information concurrently. We further propose Multi-source Decoding Inputs and Output-level Length Head to improve the decoding controllability. Our experiments show that HiTKG achieves a significant improvement in the performance of turn-level goal learning compared with state-of-the-art baselines. Additionally, both automatic and human evaluation prove the effectiveness of the two-hierarchy learning framework for both short-term and long-term goal planning.