Recent advances in combining deep neural network architectures with reinforcement learning techniques have shown promising results in solving complex control problems with high-dimensional state and action spaces. Inspired by these successes, in this paper we build two kinds of reinforcement learning agents, one based on deep policy gradients and one based on a value function, which can predict the best possible traffic signal for a traffic intersection. At each time step, these adaptive traffic light control agents receive a snapshot of the current state of a graphical traffic simulator and produce control signals. The policy-gradient based agent maps its observation directly to the control signal, whereas the value-function based agent first estimates values for all legal control signals and then selects the control action with the highest value. Our methods show promising results in a traffic network simulated in the SUMO traffic simulator, without suffering from instability issues during the training process.
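The two action-selection schemes described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the boolean mask of legal signals, and the assumption that the policy network outputs a probability distribution to sample from are all hypothetical.

```python
import math
import random
from typing import Sequence


def select_greedy_action(q_values: Sequence[float], legal: Sequence[bool]) -> int:
    """Value-based agent: return the index of the legal control signal
    with the highest estimated value (hypothetical helper)."""
    best_index, best_value = -1, -math.inf
    for index, (value, is_legal) in enumerate(zip(q_values, legal)):
        # Illegal control signals are skipped regardless of their value.
        if is_legal and value > best_value:
            best_index, best_value = index, value
    if best_index < 0:
        raise ValueError("no legal control signal available")
    return best_index


def sample_policy_action(probabilities: Sequence[float], rng: random.Random) -> int:
    """Policy-gradient agent: sample a control signal index directly from
    the policy's output distribution (assumed to be a probability vector)."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]
```

In this sketch the value-based agent needs an explicit selection step over its estimates, while the policy-based agent's output is used directly as the distribution over control signals.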
In recent years, a specific machine learning method called deep learning has attracted great interest, as it has obtained remarkable results in a broad range of applications such as pattern recognition, speech recognition, computer vision, and natural language processing. Recent research has also shown that deep learning techniques can be combined with reinforcement learning methods to learn useful representations for problems with high-dimensional raw data input. This paper reviews recent advances in deep reinforcement learning, with a focus on the most widely used deep architectures, such as autoencoders, convolutional neural networks, and recurrent neural networks, which have been successfully combined with the reinforcement learning framework.
The IEEE 1588 precision time protocol (PTP) is very important for many industrial sectors and applications that require time synchronization accuracy between computers down to microsecond and even nanosecond levels. Nevertheless, PTP and its underlying network infrastructure are vulnerable to cyber-attacks, which can stealthily reduce the time synchronization accuracy to unacceptable and even damage-causing levels for individual clocks or an entire network, leading to financial loss or even physical destruction. Existing security protocol extensions only partially address this problem. This paper provides a comprehensive analysis of strategies for advanced persistent threats to PTP infrastructure, possible attacker locations, and the impact on clock and network synchronization in the presence of security protocol extensions, infrastructure redundancy, and protocol redundancy. It distinguishes between attack strategies and attacker types as described in RFC 7384, but further distinguishes between the spoofing and time source attack, the simple internal attack, and the advanced internal attack. Experiments were conducted to demonstrate the impact of PTP attacks. Our analysis shows that a sophisticated attacker has a range of methods to compromise a PTP network. Moreover, all PTP infrastructure components can host an attacker, making the comprehensive protection of a PTP network against malware infiltration, as exemplified by Stuxnet, a very tedious task.
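To illustrate how an attack can stealthily degrade synchronization accuracy, the sketch below uses the standard PTP delay request-response arithmetic, in which the slave estimates its offset from four timestamps under the assumption of a symmetric path. The scenario is an assumption chosen for illustration (an attacker delaying only master-to-slave messages) and is not taken from the paper's experiments; the function names and numeric values are hypothetical.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Standard delay request-response estimate.
    t1: Sync sent (master clock), t2: Sync received (slave clock),
    t3: Delay_Req sent (slave clock), t4: Delay_Req received (master clock)."""
    offset = ((t2 - t1) - (t4 - t3)) / 2
    mean_path_delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, mean_path_delay


def simulate_exchange(true_offset_ns, path_delay_ns, extra_forward_delay_ns=0):
    """Simulate one exchange; extra_forward_delay_ns models an attacker
    (hypothetical) who delays only master-to-slave messages."""
    t1 = 0
    # Slave timestamps include the slave's true offset from the master.
    t2 = t1 + path_delay_ns + extra_forward_delay_ns + true_offset_ns
    t3 = t2 + 1000  # slave sends Delay_Req 1000 ns later (slave clock)
    t4 = (t3 - true_offset_ns) + path_delay_ns  # arrival in master time
    return ptp_offset_and_delay(t1, t2, t3, t4)


clean_offset, _ = simulate_exchange(true_offset_ns=0, path_delay_ns=1000)
attacked_offset, _ = simulate_exchange(0, 1000, extra_forward_delay_ns=400)
```

Under these assumptions, the slave's offset estimate is biased by half of the injected one-way delay, even though every message remains unmodified, which is why such manipulation is hard to detect from message content alone.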