Deep reinforcement learning (RL) holds considerable promise for addressing a variety of multi-agent problems in dynamic and complex environments. In multi-agent scenarios, most tasks require multiple agents to cooperate, and the number of agents has a negative impact on the training efficiency of reinforcement learning. To this end, we propose a novel method that uses the framework of centralized training and distributed execution and employs parameter sharing among homogeneous agents to replace part of the computation of network parameters during policy evolution. An asynchronous parameter sharing mechanism and a soft sharing mechanism are used to balance the exploration of individual agents against the policy consistency of homogeneous agents. We experimentally validate our approach in different types of multi-agent scenarios. The empirical results show that our method can significantly improve training efficiency in collaborative, competitive, and mixed tasks without affecting performance. INDEX TERMS Multi-agent, reinforcement learning, neural network, parameter sharing, MADDPG, training efficiency.
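The abstract does not specify how its soft sharing mechanism blends parameters, but a plausible minimal form is an exponential interpolation that pulls each homogeneous agent's parameters toward a shared set while leaving room for individual exploration. The function name `soft_share` and the blending rate `tau` are assumptions for illustration, not the paper's actual implementation:

```python
import numpy as np

def soft_share(theta_agent, theta_shared, tau=0.1):
    """Blend one agent's parameter array toward the parameters shared by
    its homogeneous group. A small tau keeps policies loosely consistent
    while preserving each agent's own exploration; tau=1 would copy the
    shared parameters outright (hard sharing)."""
    return tau * theta_shared + (1.0 - tau) * theta_agent

# Example: an agent whose weights are all zero drifts 10% of the way
# toward shared weights of one per update.
blended = soft_share(np.zeros(3), np.ones(3), tau=0.1)
```

Applied asynchronously (only every few updates, or to a subset of agents at a time), the same rule would also sketch the abstract's asynchronous sharing mechanism.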
In this paper, we explore a scalable deep reinforcement learning (DRL) method for multi-agent environments. Due to the explosive growth of the input dimensionality with the number of agents, most existing DRL methods can cope only with single-agent settings or with a small number of agents. To address this problem, we adopt a centralized training with decentralized execution framework, in which an observation embedding is used to mitigate the curse of dimensionality. Perturbations are injected into the parameter space of each agent's actor network to encourage exploration. Experiments demonstrate the effectiveness of the proposed approach in both cooperative and competitive environments. INDEX TERMS Artificial intelligence, multi-agent, deep reinforcement learning, deep deterministic policy gradient, actor-critic, centralized training with decentralized execution, observation embedding, parameter noise.
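Injecting perturbations into an actor's parameter space typically means adding zero-mean Gaussian noise to every weight array before acting, so exploration is driven by a temporarily perturbed policy rather than by noise on the actions. The sketch below assumes that form; `sigma` and the function name are hypothetical, as the abstract does not state the noise model:

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb_params(params, sigma=0.02, rng=rng):
    """Return a noisy copy of an actor's parameter arrays: each weight
    gets independent Gaussian noise with standard deviation sigma,
    encouraging exploration in parameter space (assumed noise model)."""
    return [p + rng.normal(0.0, sigma, size=p.shape) for p in params]

# Example: perturb a toy two-layer actor's weight matrices.
actor = [np.zeros((4, 8)), np.zeros((8, 2))]
noisy_actor = perturb_params(actor)
```

In adaptive variants (e.g. the parameter-noise scheme popularized for DDPG-style actors), `sigma` is scaled up or down depending on how far the perturbed policy's actions drift from the unperturbed ones.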
In this paper, our goal is to improve the recognition accuracy of battlefield target aggregation behavior while maintaining the low computational cost of spatio-temporal deep neural networks. To this end, we propose a novel 3D-CNN (3D convolutional neural network) model that extends the idea of multi-scale feature fusion to the spatio-temporal domain and enhances the feature extraction ability of the network by combining feature maps from different convolutional layers. To reduce the computational complexity of the network, we further improve the multi-fiber network and establish a 3D-convolution two-stream architecture based on multi-scale feature fusion. Extensive experimental results on simulation data show that our network significantly boosts the efficiency of existing convolutional neural networks for aggregation behavior recognition, achieving state-of-the-art performance on the dataset constructed in this paper.
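One common way to combine feature maps from different convolutional layers, whose spatio-temporal resolutions differ, is to pool each map to a fixed size and concatenate along the channel axis. The numpy sketch below illustrates that generic fusion pattern only; the paper's actual fusion scheme is not specified in the abstract, and the layout `(C, T, H, W)` is an assumption:

```python
import numpy as np

def fuse_multiscale(feature_maps):
    """Global-average-pool each spatio-temporal feature map over its
    time/height/width axes (assumed layout (C, T, H, W)), then
    concatenate the per-layer channel vectors into one fused descriptor.
    Maps from different layers may have different shapes."""
    pooled = [fm.mean(axis=(1, 2, 3)) for fm in feature_maps]  # each -> (C,)
    return np.concatenate(pooled)

# Example: fuse an early 8-channel map with a deeper 16-channel map.
fused = fuse_multiscale([np.ones((8, 4, 16, 16)), np.zeros((16, 2, 8, 8))])
```

In a real network the fusion would operate on learned feature maps inside the model rather than on raw arrays, but the shape bookkeeping is the same.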
Interconnecting multiple data link systems is an urgent problem for wireless control systems; the difficulty lies in the fact that data link messages are multi-source and heterogeneous. By analyzing their sub-domain characteristics, we construct a data link message domain ontology and establish a data link message ontology model based on a Bayesian network (DLMOBN). The model covers the nodes, directed edges, node similarity probability distributions, and related elements, converting multi-source heterogeneous messages into a mathematical model. We propose a data link message ontology mapping algorithm: OWL syntax is used to formally describe the acquired domain ontology; useful information such as concepts, attributes, and instances is extracted and stored in a preset data structure; and a k-means algorithm clusters this information into "clusters", which serve as classification indices that assign each similarity pair to a node in the Bayesian network, passing lower-layer concepts between nodes up to the prior concepts of the upper layer. The semantic distance, attributes, features, and other factors of each similar pair are used to calculate semantic similarities, and the final semantic similarity value is obtained by weighting. Experiments verify that the method improves recall and precision and reduces time complexity.
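The final weighting step described above can be sketched as a convex combination of the per-factor similarity scores. The specific factor weights below are hypothetical placeholders; the abstract does not state how the weights are chosen:

```python
def combined_similarity(sim_distance, sim_attribute, sim_feature,
                        weights=(0.4, 0.3, 0.3)):
    """Combine semantic-distance, attribute, and feature similarities
    (each assumed to lie in [0, 1]) into one final semantic similarity
    value via a weighted sum. The weights are illustrative and should
    sum to 1 so the result stays in [0, 1]."""
    parts = (sim_distance, sim_attribute, sim_feature)
    return sum(w * s for w, s in zip(weights, parts))

# Example: a pair that is close in the ontology but shares few attributes.
score = combined_similarity(0.5, 0.2, 0.8)
```

A pair would then be accepted as a mapping when `score` exceeds a chosen threshold.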
Using expert samples to improve the performance of reinforcement learning (RL) algorithms has become a focus of current research. However, in different application scenarios it is hard to guarantee both the quantity and quality of expert samples, which limits the practical applicability and performance of such algorithms. In this paper, a novel RL decision optimization method is proposed. The method reduces the dependence on expert samples by incorporating a decision-making evaluation mechanism. By introducing supervised learning (SL), our method optimizes the decision making of the RL algorithm using demonstrations or expert samples. Experiments are conducted in the Pendulum and Puckworld scenarios, with representative algorithms such as deep Q-network (DQN) and Double DQN (DDQN) as benchmarks. The results demonstrate that the proposed method can effectively improve the decision-making performance of agents even when expert samples are unavailable.
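Combining a Q-learning objective with a supervised term on expert actions is commonly done with a large-margin loss (as in DQfD): the expert's action must score at least a margin above all others, and the supervised term vanishes gracefully when no demonstration is available. The sketch below assumes that style of combination; `gamma`, `margin`, and `lam` are hypothetical settings, not values from the paper:

```python
import numpy as np

def combined_loss(q_values, action, reward, q_next_max, expert_action=None,
                  gamma=0.99, margin=0.8, lam=0.5):
    """Squared TD loss for one transition, plus an optional large-margin
    supervised term that pushes the expert's action above all others.
    With expert_action=None the objective reduces to pure RL, so the
    method degrades gracefully when expert samples are unavailable."""
    td_target = reward + gamma * q_next_max
    td_loss = (q_values[action] - td_target) ** 2
    if expert_action is None:
        return td_loss  # no demonstration for this transition
    margins = np.full_like(q_values, margin)
    margins[expert_action] = 0.0  # no margin against the expert's own action
    sl_loss = np.max(q_values + margins) - q_values[expert_action]
    return td_loss + lam * sl_loss

# Example: the expert's action (index 1) already dominates by > margin,
# so the supervised term contributes nothing.
q = np.array([0.0, 2.0])
loss = combined_loss(q, action=1, reward=1.0, q_next_max=0.0, expert_action=1)
```

In practice both terms would be computed over minibatches from a replay buffer, with demonstration transitions flagged so only they receive the supervised term.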