Energy efficiency is a major concern in hierarchical wireless sensor networks (WSNs), where most energy is consumed by the radios used for communication. Because cluster members spend considerable energy on long-range transmission and the Cluster Head (CH) spends considerable energy on data aggregation, saving and balancing energy consumption is a difficult challenge in WSNs. In this paper, we design a CH selection mechanism with a mobile sink (MS) and propose relay selection algorithms based on the multiuser multi-armed bandit (MU-MAB) to address energy efficiency. Based on the definitions of node density and residual energy, we propose a concept referred to as a Virtual Head (VH) for energy-efficient data collection by the MS. Moreover, we recast the relay selection problem as a permutation problem by employing the two-hop transmission used in cooperative power line communication, which handles long-distance transmission. To solve the relay selection problem, we propose a machine learning algorithm, MU-MAB, whose reward is associated with the increment in energy consumption. Furthermore, we employ stable matching theory based on marginal utility to allocate the final one-to-one optimal combinations and achieve energy efficiency. To evaluate MU-MAB, we use regret to demonstrate its performance under the upper confidence bound (UCB) index. Finally, simulation results illustrate the effectiveness of the proposed solutions for saving and balancing energy consumption.
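
For readers unfamiliar with UCB-based bandits, the following is a minimal single-user sketch of how a UCB index can drive relay selection. It assumes the classical UCB1 index (empirical mean reward plus an exploration bonus of sqrt(2 ln t / n)), which may differ from the exact index used in this paper. The function names and the reward scale (for example, a normalized reward that is larger when the energy increment is smaller) are hypothetical choices made for illustration only, and the multiuser matching step is not shown.

```python
import math


def ucb1_index(mean_reward, pulls, t):
    """Classical UCB1 index for one candidate relay (assumed form, not the paper's)."""
    if pulls == 0:
        # An untried relay gets an infinite index so it is explored first.
        return math.inf
    return mean_reward + math.sqrt(2.0 * math.log(t) / pulls)


def select_relay(means, counts, t):
    """Return the position of the relay with the largest UCB1 index at round t."""
    indices = [ucb1_index(m, n, t) for m, n in zip(means, counts)]
    return max(range(len(indices)), key=indices.__getitem__)


def update(means, counts, relay, reward):
    """Incrementally update the empirical mean reward of the chosen relay."""
    counts[relay] += 1
    means[relay] += (reward - means[relay]) / counts[relay]
```

In a simulation loop, one would call `select_relay` each round, observe a reward for the chosen relay, and pass it to `update`; regret is then the accumulated gap between the best relay's expected reward and the rewards actually obtained.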