The Industrial Internet of Things has grown significantly in recent years. While industrial digitalization, automation, and intelligence bring a slew of cyber risks, the complex and varied Industrial Internet of Things environment also gives network attackers a new attack surface. As a result, conventional intrusion detection technology cannot satisfy the threat discovery requirements of today's Industrial Internet of Things environment. In this research, the authors use reinforcement learning rather than supervised or unsupervised learning because it can improve the decision-making ability of the learning process. Deep learning contributes perception: it applies simple nonlinear transformations to large-scale raw input data to produce higher-level abstract representations. Reinforcement learning contributes decision-making: in the absence of guiding knowledge, it learns by trial and error from feedback signals, interacting with the environment to find the best solution. Accordingly, this article presents an intrusion detection system for the Industrial Internet of Things based on proximal policy optimization (PPO), a deep reinforcement learning algorithm. The method combines deep learning's observation capability with reinforcement learning's decision-making capability to enable efficient detection of different kinds of cyberattacks on the Industrial Internet of Things. The DRL-IDS intrusion detection system described in this manuscript is built on a LightGBM-based feature selection method, which efficiently selects the most informative feature set from Industrial Internet of Things data; paired with deep learning algorithms, it detects intrusions effectively.
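To make the policy optimization step more concrete, the following is a minimal sketch of the clipped surrogate objective that proximal policy optimization maximizes. It is an illustration under stated assumptions, not the authors' implementation: the function names, the clipping value of 0.2, and the +1/-1 reward for correct and incorrect classifications are all hypothetical choices, since the abstract does not specify them.

```python
import math


def detection_reward(predicted_label, true_label):
    """Hypothetical reward: +1 for a correct classification, -1 otherwise.

    The abstract does not state the reward design; this is an assumption.
    """
    return 1.0 if predicted_label == true_label else -1.0


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for those actions.
    clip_eps: clipping range (0.2 is a common default, assumed here).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is the "proximal" part of the method. When the ratio is 1, the objective reduces to the mean advantage.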
First, a feature selection algorithm based on LightGBM extracts the most informative feature set from Industrial Internet of Things data. Then, in conjunction with the deep learning algorithm, the hidden layer of a multilayer perceptron is used as the shared network structure for the value network and the policy network in the PPO2 algorithm. Finally, the intrusion detection model is constructed using the PPO2 algorithm with ReLU (rectified linear unit) activation. Numerous tests on a publicly available Industrial Internet of Things data set demonstrate that the proposed intrusion detection system detects 99 percent of the different kinds of network attacks on the Industrial Internet of Things. Additionally, the accuracy rate is 0.9%. The accuracy, precision, recall, F1 score, and other performance indicators are superior to those of existing intrusion detection systems based on deep learning models such as LSTM, CNN, and RNN, as well as deep reinforcement learning models such as DDQN and DQN.
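The shared network structure described above can be sketched as a single hidden layer feeding two output heads, one for the policy (class probabilities) and one for the value estimate. This is a minimal forward-pass illustration in plain Python, not the authors' model: the class name `SharedActorCritic`, the layer sizes, the weight initialization, and the use of one hidden layer are assumptions made for the example, and training is omitted.

```python
import math
import random


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(weights, bias, v):
    """Dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def init_layer(n_out, n_in, rng):
    """Small uniform random weights and zero biases (assumed scheme)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class SharedActorCritic:
    """Hypothetical MLP whose hidden layer is shared by both heads."""

    def __init__(self, n_features, n_hidden, n_classes, seed=0):
        rng = random.Random(seed)
        self.hidden = init_layer(n_hidden, n_features, rng)      # shared
        self.policy_head = init_layer(n_classes, n_hidden, rng)  # actor
        self.value_head = init_layer(1, n_hidden, rng)           # critic

    def forward(self, features):
        # Both heads read the same ReLU-activated hidden representation.
        h = relu(linear(*self.hidden, features))
        probs = softmax(linear(*self.policy_head, h))
        value = linear(*self.value_head, h)[0]
        return probs, value


# Usage: four selected features, three traffic classes (both assumed).
net = SharedActorCritic(n_features=4, n_hidden=8, n_classes=3)
probs, value = net.forward([0.1, 0.5, -0.2, 1.0])
```

Sharing the hidden layer means the policy and the value estimate are computed from the same learned representation of the selected features, which reduces the number of parameters compared with two separate networks.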