This paper analyzes network attacks in the electric power information environment. The steps of an intrusion attack are examined, and Bayesian inference is applied to investigate how attack source information is delivered through the network. The success probability of a network attack is quantified by its likelihood. Noisy Net, Dueling DQN, Soft Q-learning, prioritized experience replay, and the ICM model are integrated to improve the DQN algorithm from different perspectives, and an NDSPI-DQN algorithm based on Bayesian inference is proposed. In a comparison of the convergence performance of DQN, PPO, and the proposed algorithm, both the proposed algorithm and PPO converge to the maximum cumulative reward within 1000 rounds, and the proposed algorithm converges to the optimal value within 350 rounds. In an environment with 120 hosts, the proposed algorithm discovers the optimal path with a success rate of 97.23%; its optimal number of iterations and average running time are 1.12 and 3.81 seconds, respectively. The proposed method is suitable for large-scale power information networks and offers higher execution efficiency.
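As an illustration of two of the DQN improvements named above, the following minimal Python sketch shows the commonly used dueling aggregation of a state value and per-action advantages, and the proportional sampling probabilities used in prioritized experience replay. It follows the standard textbook formulations, not necessarily the exact configuration of NDSPI-DQN; the function names and the `alpha` and `eps` defaults are assumptions introduced here for illustration.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def priority_probabilities(
    td_errors: List[float], alpha: float = 0.6, eps: float = 1e-6
) -> List[float]:
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha,
    with p_i = |TD error_i| + eps (alpha and eps are assumed defaults)."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


if __name__ == "__main__":
    # Hypothetical values for a state with three candidate attack actions.
    print(dueling_q_values(1.0, [0.5, -0.5, 0.0]))
    print(priority_probabilities([2.0, 0.5, 0.1]))
```

Subtracting the mean advantage keeps the value and advantage streams identifiable, and the exponent `alpha` controls how strongly transitions with large temporal-difference errors are favored during replay (`alpha = 0` recovers uniform sampling).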