In recent years, deep reinforcement learning methods have achieved impressive performance in many fields, including game playing, robotics, and dialogue systems. However, these methods still face significant restrictions, one of which is the demand for massive amounts of sampled data. In this paper, a hierarchical meta-learning method based on the actor-critic algorithm is proposed for sample-efficient learning. This method provides transferable knowledge that can efficiently train an actor on a new task within a few trials. Specifically, a global basic critic, a meta critic, and a task-specific network are shared across a distribution of tasks and are capable of criticizing any actor attempting to solve any specified task. The hierarchical framework is applied to the critic network of the actor-critic algorithm to distill meta-knowledge above the task level and address distinct tasks. The proposed method is evaluated on multiple classic control tasks against reinforcement learning algorithms, including state-of-the-art meta-learning methods. The experimental results statistically demonstrate that the proposed method achieves state-of-the-art performance and attains better results as the depth of the meta-critic network increases.
INDEX TERMS Deep reinforcement learning, hierarchical framework, knowledge, meta-learning.
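As a minimal illustrative sketch (not the authors' implementation), the shared hierarchical critic described above could be organized as follows in PyTorch; the module names, hidden sizes, and per-task value heads are assumptions introduced here for illustration.

import torch
import torch.nn as nn

class MetaCritic(nn.Module):
    """Critic shared across a distribution of tasks: a global basic critic feeds
    shared meta layers, followed by a lightweight task-specific head."""
    def __init__(self, obs_dim, act_dim, n_tasks, hidden=128):
        super().__init__()
        # global basic critic: shared feature extractor over (state, action)
        self.basic = nn.Sequential(nn.Linear(obs_dim + act_dim, hidden), nn.ReLU())
        # meta critic: shared layers intended to distill knowledge above the task level
        self.meta = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        # task-specific heads: one small value head per task in the distribution
        self.heads = nn.ModuleList([nn.Linear(hidden, 1) for _ in range(n_tasks)])

    def forward(self, obs, act, task_id):
        h = self.basic(torch.cat([obs, act], dim=-1))
        return self.heads[task_id](self.meta(h))

class Actor(nn.Module):
    """Per-task actor trained against the shared critic."""
    def __init__(self, obs_dim, act_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, act_dim), nn.Tanh())

    def forward(self, obs):
        return self.net(obs)

# Usage: a new task only needs its own small actor (and head);
# the basic and meta layers are reused across the task distribution.
obs = torch.randn(8, 4)
critic = MetaCritic(obs_dim=4, act_dim=1, n_tasks=3)
actor = Actor(obs_dim=4, act_dim=1)
q_value = critic(obs, actor(obs), task_id=0)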
Most existing knowledge representation learning methods project the entities and relations represented by symbols in a knowledge graph into a low-dimensional vector space from the perspective of the structure and semantics of triples, and express the complex relations between entities and relations with dense low-dimensional vectors. However, the triples in a knowledge graph include not only relation triples but also a large number of attribute triples. Existing knowledge representation methods often confuse these two kinds of triples and pay little attention to the semantic information contained in attributes and attribute values. In this paper, a novel representation learning method that makes use of the attribute information of entities is proposed. Specifically, a deep convolutional neural network model is used to encode the attribute information of entities, and both attribute information and triple structure information are utilized to learn knowledge representations and generate attribute-based representations of entities. The method is evaluated on the knowledge graph completion task, and the experimental results on the open datasets FB15K and FB24k show that the attribute-embodied knowledge representation learning model outperforms the other baselines.
INDEX TERMS Attribute, knowledge graph, representation learning.
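As a hedged sketch of the idea (the exact architecture and training objective are not specified in the abstract), attribute information could be encoded with a 1-D convolutional network and combined with a TransE-style structural score; the vocabulary size, embedding dimension, kernel size, and the TransE scoring choice are assumptions made here for illustration.

import torch
import torch.nn as nn

class AttributeEncoder(nn.Module):
    """Encodes an entity's attribute/value tokens with a 1-D CNN into an
    attribute-based entity representation."""
    def __init__(self, vocab_size, dim=100, kernel=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.conv = nn.Conv1d(dim, dim, kernel_size=kernel, padding=1)

    def forward(self, attr_tokens):                        # (batch, seq_len)
        x = self.embed(attr_tokens).transpose(1, 2)        # (batch, dim, seq_len)
        return torch.relu(self.conv(x)).max(dim=2).values  # (batch, dim)

def transe_score(h, r, t):
    """TransE-style score ||h + r - t||_1; lower means a more plausible triple."""
    return (h + r - t).abs().sum(dim=-1)

# Joint training would combine a structural loss over relation triples with an
# attribute-based loss over the CNN-encoded entity representations.
enc = AttributeEncoder(vocab_size=5000)
h_attr = enc(torch.randint(0, 5000, (8, 12)))  # hypothetical attribute token ids for head entities
t_attr = enc(torch.randint(0, 5000, (8, 12)))  # ... and for tail entities
r = torch.randn(8, 100)                        # relation embeddings
attr_loss = transe_score(h_attr, r, t_attr).mean()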
Multi-agent deep reinforcement learning (MDRL) is an emerging research hotspot and application direction in the field of machine learning and artificial intelligence. MDRL covers many algorithms, rules, and frameworks; it is currently studied in swarm systems, energy allocation optimization, stock analysis, and sequential social dilemmas, and it has an extremely bright future. In this paper, a parallel-critic method based on the classic MDRL algorithm MADDPG is proposed to alleviate the training instability problem in cooperative-competitive multi-agent environments. Furthermore, a policy smoothing technique is introduced into the proposed method to decrease the variance of the learned policies. The suggested method is evaluated in three different scenarios of the authoritative multi-agent particle environment (MPE). Multiple statistics of the experimental results show that our method significantly improves training stability and performance compared to vanilla MADDPG.
INDEX TERMS Multi-agent system, deep reinforcement learning, parallel-critic architecture, training stability.
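To make the two techniques concrete, the following is a rough sketch rather than the paper's code: it assumes centralized critics that take all agents' observations and actions (as in MADDPG), several parallel target critics whose estimates are averaged to form the TD target, and TD3-style clipped noise for policy smoothing; the averaging rule, noise scale, and clipping bound are assumptions.

import torch

def smoothed_target_action(target_actor, next_obs, noise_std=0.2, noise_clip=0.5):
    """Policy smoothing: add clipped Gaussian noise to the target action so the
    critic is trained on a small neighborhood of actions, reducing variance."""
    action = target_actor(next_obs)
    noise = (torch.randn_like(action) * noise_std).clamp(-noise_clip, noise_clip)
    return (action + noise).clamp(-1.0, 1.0)

def parallel_critic_target(target_critics, next_obs_all, next_act_all, reward, gamma=0.95):
    """Parallel critics: combine several target critics (here by averaging) to
    form a more stable TD target for each agent's centralized Q-function."""
    x = torch.cat(next_obs_all + next_act_all, dim=-1)  # joint observations and actions
    q_values = torch.stack([critic(x) for critic in target_critics], dim=0)
    return reward + gamma * q_values.mean(dim=0)

Averaging (or taking a minimum) over parallel critics is one common way to damp the critic noise that destabilizes MADDPG-style training; which combination rule the authors use is not stated in the abstract.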
SUMMARY: As an important method for solving sequential decision-making problems, reinforcement learning learns the policy for a task through interaction with the environment. However, it has difficulty scaling to large-scale problems. One of the reasons is the exploration-exploitation dilemma, which may lead to inefficient learning. We present an approach that addresses this shortcoming by introducing qualitative knowledge into reinforcement learning, using cloud control systems to represent 'if-then' rules. We use these rules as a heuristic exploration strategy to guide action selection in deep reinforcement learning. Empirical evaluation results show that our approach yields a significant improvement in the learning process.
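A minimal sketch of rule-guided exploration, assuming discrete actions and a rule base of (condition, action) pairs; the fixed rule-firing probability and the omission of the cloud-model membership computation are simplifications made here for illustration, not the authors' method.

import random

def heuristic_action(q_values, rules, state, epsilon=0.1, rule_prob=0.3):
    """Heuristic exploration: with some probability follow a qualitative
    'if-then' rule that fires on the current state; otherwise fall back to
    epsilon-greedy selection over the learned Q-values."""
    for condition, action in rules:
        if condition(state) and random.random() < rule_prob:
            return action                                   # knowledge-guided exploration
    if random.random() < epsilon:
        return random.randrange(len(q_values))              # random exploration
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploitation

# Hypothetical rule for a CartPole-like state: "if the pole leans right, push right".
rules = [(lambda state: state[2] > 0.05, 1)]
action = heuristic_action(q_values=[0.2, 0.1], rules=rules, state=[0.0, 0.0, 0.1, 0.0])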