Zhiqiang Sun scite author profile

Deep reinforcement learning, a research hotspot in artificial intelligence, allows a manipulator to learn motion skills without a kinematic model. Twin Delayed Deep Deterministic Policy Gradient (TD3) was proposed to suppress the value overestimation bias of Deep Deterministic Policy Gradient (DDPG) networks. This paper further suppresses that bias for multi-degree-of-freedom (DOF) manipulator learning by proposing Twin Delayed Deep Deterministic Policy Gradient with Rebirth Mechanism (RTD3). Experimental results show that RTD3 is effective on multi-DOF manipulators, improving learning ability by 29.15% relative to TD3. A step-by-step reward function is proposed specifically for learning the motion ability of a multi-DOF manipulator. The manipulator's learning is framed as a continuous decision process problem, and learning efficiency is improved by optimizing experience replay. To measure the point-to-point position motion ability of a manipulator, the paper presents a new evaluation index, energy efficiency distance, based on the characteristics of the continuous decision process problem; it allows a more comprehensive and fair evaluation of how well the manipulator has learned its motion ability.
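The abstract does not describe the rebirth mechanism itself, so the sketch below covers only the baseline it builds on: the way standard TD3 limits value overestimation by taking the minimum of two critic estimates and smoothing the target action with clipped noise. The function names, argument names, and default values (`noise_clip`, `max_action`, `gamma`) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def smoothed_target_action(mu_next, noise, noise_clip=0.5, max_action=1.0):
    """Target policy smoothing: add clipped noise, then clip to the action range.

    mu_next: target-actor output for the next state (hypothetical input).
    noise:   sampled exploration noise of the same shape.
    """
    clipped_noise = np.clip(noise, -noise_clip, noise_clip)
    return np.clip(mu_next + clipped_noise, -max_action, max_action)

def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of the two critics.

    q1_next, q2_next: the two target critics evaluated at the smoothed action.
    done:             1.0 where the episode ended, else 0.0.
    """
    min_q = np.minimum(q1_next, q2_next)
    return reward + gamma * (1.0 - done) * min_q
```

Taking the minimum makes the bootstrap target pessimistic, which is what counters the upward bias a single critic accumulates in DDPG.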


Bilinear matrix inequality approaches to robust guaranteed cost control for uncertain discrete‐time delay system

Nian

Sun

Wang

et al. 2012

Optim Control Appl Methods


SUMMARY: The robust guaranteed cost control problem for uncertain discrete‐time delay systems is considered in this paper. Sufficient conditions for the existence of robust guaranteed cost controllers via memoryless state feedback and static output feedback are expressed as bilinear matrix inequalities (BMIs). Furthermore, design methods for optimal robust guaranteed cost controllers, which minimize the upper bound of a given quadratic cost function, are presented. Alternating iterative algorithms are proposed to solve the nonconvex optimization problems with BMI constraints. A numerical example is given to illustrate the effectiveness of the proposed methods. Copyright © 2012 John Wiley & Sons, Ltd.
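For orientation, a generic guaranteed cost setup of the kind the summary describes can be written as follows. This is a textbook-style sketch, not the paper's formulation: every symbol (the matrices $A$, $A_d$, $B$, the uncertainties $\Delta A$, $\Delta A_d$, $\Delta B$, the delay $d$, the weights $Q$, $R$, and the gain $K$) is an assumption introduced here for illustration.

```latex
% Uncertain discrete-time system with a state delay d (assumed form)
x(k+1) = (A + \Delta A)\,x(k) + (A_d + \Delta A_d)\,x(k-d) + (B + \Delta B)\,u(k)

% Quadratic cost with weights Q \succeq 0, R \succ 0
J = \sum_{k=0}^{\infty} \left[ x^{T}(k)\,Q\,x(k) + u^{T}(k)\,R\,u(k) \right]

% Memoryless state feedback and the guaranteed cost property:
% the closed loop is stable and J stays below a bound J^{*}
% for every admissible uncertainty
u(k) = K\,x(k), \qquad J \le J^{*}
```

In formulations like this, the existence conditions typically contain products of unknown matrices, such as the gain and a Lyapunov matrix, which is what makes them bilinear rather than linear. An alternating scheme fixes one group of unknowns, solves the resulting linear matrix inequality in the other, and then swaps.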


Research of localization algorithm based on weighted Voronoi diagrams for wireless sensor network

Cai

Pan

Gao

et al. 2014

J Wireless Com Network


A wireless sensor network (WSN) is formed by a large number of cheap sensors that communicate over an ad hoc wireless network to collect information about sensed objects in a certain area. The acquired information is useful only when the locations of the sensors and objects are known. Therefore, localization is one of the most important technologies in WSNs. In this paper, a weighted Voronoi diagram-based localization scheme (W-VBLS) is proposed to extend the Voronoi diagram-based localization scheme (VBLS). In this scheme, first, a node estimates its distances to neighboring beacons from the received signal strength indicator (RSSI) and divides the beacons into groups of three whose distances are similar. Second, a weighted bisector is calculated from the triangle formed by the node and two beacons of a group. Third, an estimated position of the node, weighted by the largest RSSI value, is calculated from the three bisectors of the same group. Finally, the position of the node is calculated as the weighted average of all estimated positions. Simulation shows that, compared with centroid and VBLS, W-VBLS has higher positioning accuracy and lower computational complexity.
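The first and last steps of the scheme can be sketched as below. The abstract does not say which RSSI-to-distance model or which weight scale W-VBLS uses, so two assumptions are made here: distances follow the common log-distance path-loss model, and RSSI values in dBm are converted to linear power before being used as weights. The function names and default parameters are hypothetical, and the weighted-bisector construction of the middle steps is not reproduced.

```python
import numpy as np

def rssi_to_distance(rssi_dbm, p0_dbm=-40.0, path_loss_exp=2.0, d0=1.0):
    """Estimate distance from RSSI with a log-distance path-loss model.

    p0_dbm is the assumed RSSI at reference distance d0 (metres);
    path_loss_exp is the assumed path-loss exponent of the environment.
    """
    return d0 * 10.0 ** ((p0_dbm - rssi_dbm) / (10.0 * path_loss_exp))

def weighted_position(estimates, rssi_dbm):
    """Combine per-group position estimates into one position.

    estimates: array of shape (n, 2), one estimated (x, y) per beacon group.
    rssi_dbm:  array of shape (n,), the largest RSSI seen in each group.
    Weights are linear powers, so stronger signals count for more.
    """
    estimates = np.asarray(estimates, dtype=float)
    weights = 10.0 ** (np.asarray(rssi_dbm, dtype=float) / 10.0)
    return (weights[:, None] * estimates).sum(axis=0) / weights.sum()
```

Converting dBm to linear power keeps every weight positive; using raw dBm values directly would give negative weights, which is why the conversion is assumed here.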


Persistent coverage of UAVs based on deep reinforcement learning with wonderful life utility

et al. 2023




Zhiqiang Sun

Estimating Human Error Probability using a modified CREAM

The Control Method of Twin Delayed Deep Deterministic Policy Gradient with Rebirth Mechanism to Multi-DOF Manipulator

