The growing number of services in the power distribution grid leads to a massive increase in task data. The power distribution internet of things (PDIoT) is the application of the internet of things (IoT) to the power distribution grid. A large number of PDIoT devices are deployed to collect voltage, active power, reactive power, and harmonic parameters in support of distribution grid services such as fault identification and status detection. These devices offload their tasks to edge servers over 5G networks for real-time data processing. However, dynamically selecting edge servers and channels to meet the energy-efficiency and low-latency task offloading requirements of PDIoT devices still faces several technical challenges: the task offloading decisions of different devices are coupled, global state information is unobtainable, and quality of service (QoS) metrics such as energy efficiency and delay are interrelated. To this end, we first formulate a joint optimization problem that maximizes the weighted difference between the energy efficiency and the delay of PDIoT devices. The joint optimization problem is then decomposed into a large-timescale server selection problem and a small-timescale channel selection problem. Next, we propose an ML-based two-stage task offloading algorithm, in which the large-timescale problem is solved by two-sided matching in the first stage and the small-timescale problem is solved by adaptive ε-greedy learning in the second stage. Finally, simulation results show that the proposed algorithm outperforms the task offloading delay-first matching algorithm and the matching theory-based task offloading strategy in terms of both energy efficiency and delay.
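
To illustrate the second stage, the following is a minimal sketch of ε-greedy channel selection with a decaying exploration rate. It is not the proposed algorithm itself: the decay schedule `eps0 / sqrt(t + 1)`, the class name `AdaptiveEpsilonGreedy`, the helper `reward`, and the scalar `weight` are all assumptions introduced here for illustration. The reward only mirrors the weighted difference between energy efficiency and delay described above.

```python
import random


class AdaptiveEpsilonGreedy:
    """Hypothetical channel selector with a decaying exploration rate."""

    def __init__(self, n_channels, eps0=1.0, eps_min=0.01, seed=None):
        self.n = n_channels
        self.eps0 = eps0                    # initial exploration rate (assumed)
        self.eps_min = eps_min              # exploration floor (assumed)
        self.counts = [0] * n_channels      # times each channel was selected
        self.values = [0.0] * n_channels    # running mean reward per channel
        self.t = 0                          # number of completed updates
        self.rng = random.Random(seed)

    def epsilon(self):
        # Assumed schedule: eps0 / sqrt(t + 1), never below eps_min.
        return max(self.eps_min, self.eps0 / (self.t + 1) ** 0.5)

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon():
            return self.rng.randrange(self.n)
        return self.values.index(max(self.values))

    def update(self, channel, observed_reward):
        # Incremental update of the running mean reward for the chosen channel.
        self.t += 1
        self.counts[channel] += 1
        step = (observed_reward - self.values[channel]) / self.counts[channel]
        self.values[channel] += step


def reward(energy_efficiency, delay, weight=1.0):
    """Weighted difference between energy efficiency and delay (assumed form)."""
    return energy_efficiency - weight * delay
```

In each small-timescale slot, a device would call `select()` to pick a channel, observe the resulting energy efficiency and delay, and pass `reward(...)` to `update()`. Because only locally observed rewards are used, the sketch needs no global state information.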