Parameter-exploring policy gradients

We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in parameter space, which leads to lower-variance gradient estimates than those obtained by regular policy gradient methods. We show that for several complex control tasks, including robust standing with a humanoid robot, this method outperforms well-known algorithms from the fields of standard policy gradients, finite difference methods and population-based heuristics. We also show that the improvement is largest when the parameter samples are drawn symmetrically. Lastly, we analyse the importance of the individual components of our method by incrementally incorporating them into the other algorithms and measuring the gain in performance after each step.
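The idea of sampling in parameter space with symmetrically drawn samples can be illustrated with a minimal sketch. It is a simplified illustration under stated assumptions, not the authors' exact update rules: the function name `pgpe_symmetric`, the learning rates, the moving-average baseline and the clamping of the standard deviations are all choices made here for the example.

```python
import random


def pgpe_symmetric(reward_fn, mu, sigma, steps=300, lr_mu=0.02, lr_sigma=0.01,
                   sigma_bounds=(0.05, 1.0), seed=0):
    """Simplified parameter-space policy search with symmetric sampling.

    Parameters are drawn from independent Gaussians N(mu_i, sigma_i^2).
    All hyperparameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    mu, sigma = list(mu), list(sigma)
    baseline = None  # moving-average reward baseline (assumed form)
    for _ in range(steps):
        # Draw one perturbation and evaluate it mirrored around the mean.
        eps = [rng.gauss(0.0, s) for s in sigma]
        r_plus = reward_fn([m + e for m, e in zip(mu, eps)])
        r_minus = reward_fn([m - e for m, e in zip(mu, eps)])
        r_avg = 0.5 * (r_plus + r_minus)
        if baseline is None:
            baseline = r_avg
        for i, e in enumerate(eps):
            # Mean update: driven by the reward difference of the mirrored pair.
            mu[i] += lr_mu * e * 0.5 * (r_plus - r_minus)
            # Std-dev update: driven by the pair's average reward vs. the baseline.
            s = sigma[i]
            s += lr_sigma * (r_avg - baseline) * (e * e - s * s) / s
            # Clamp for numerical stability (a safeguard assumed for this sketch).
            sigma[i] = min(max(s, sigma_bounds[0]), sigma_bounds[1])
        baseline = 0.9 * baseline + 0.1 * r_avg
    return mu, sigma
```

As a usage example, `reward_fn` could be the negative squared distance of the parameter vector to a hypothetical target; the returned mean then serves as the learned parameter estimate. Each sampled parameter vector is held fixed for a whole evaluation, which is why the exploration noise enters only once per rollout rather than at every action.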

Policy Gradients with Parameter-Based Exploration for Control

Sehnke

Osendorfer

Rückstieß

et al.

Abstract. We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in parameter space, which leads to lower-variance gradient estimates than those obtained by policy gradient methods such as REINFORCE. For several complex control tasks, including robust standing with a humanoid robot, we show that our method outperforms well-known algorithms from the fields of policy gradients, finite difference methods and population-based heuristics. We also provide a detailed analysis of the differences between our method and the other algorithms.

Two-Stage Peer-Regularized Feature Recombination for Arbitrary Image Style Transfer

Svoboda

Anoosheh

Osendorfer

et al. 2020

Deep Iterative Surface Normal Estimation

Lenssen

Osendorfer

Masci

2020

Sequential Feature Selection for Classification

Rückstieß

Osendorfer

Smagt

2011

Abstract. In most real-world information processing problems, data is not a free resource; its acquisition is rather time-consuming and/or expensive. We investigate how these two factors can be included in supervised classification tasks by deriving classification as a sequential decision process and making it accessible to Reinforcement Learning. Our method performs a sequential feature selection that learns which features are most informative at each timestep, choosing the next feature depending on the already selected features and the internal belief of the classifier. Experiments on a handwritten digits classification task show significant reduction in required data for correct classification, while a medical diabetes prediction task illustrates variable feature cost minimization as a further property of our algorithm.
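Framing classification as a sequential decision process can be sketched roughly as follows. This is only an illustrative sketch under assumptions: the class `FeatureAcquisitionEpisode`, the helper `run_episode`, and the reward scheme (negative feature cost per acquisition, +1 for a correct final classification) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureAcquisitionEpisode:
    """One classification episode in which features are acquired one by one."""
    features: list   # full feature vector, hidden from the agent
    label: int       # true class
    costs: list      # per-feature acquisition cost (may vary by feature)
    revealed: dict = field(default_factory=dict)  # index -> observed value
    total_cost: float = 0.0

    def observe(self):
        # The agent only sees the values it has already paid for.
        return dict(self.revealed)

    def acquire(self, index):
        # Selecting a feature reveals its value; the reward is its negative cost.
        if index in self.revealed:
            raise ValueError("feature already acquired")
        self.revealed[index] = self.features[index]
        self.total_cost += self.costs[index]
        return -self.costs[index]

    def classify(self, predicted_label):
        # Ends the episode: +1 for a correct prediction, 0 otherwise (assumed).
        return 1.0 if predicted_label == self.label else 0.0


def run_episode(episode, choose_action):
    """Roll out a policy that maps an observation to
    ("acquire", feature_index) or ("classify", label)."""
    episode_return = 0.0
    while True:
        kind, value = choose_action(episode.observe())
        if kind == "classify":
            return episode_return + episode.classify(value)
        episode_return += episode.acquire(value)
```

In this framing, a learned policy would replace the hand-written `choose_action` callback and decide, from the features revealed so far, whether to buy another feature or to commit to a label. Per-feature costs make it possible to express that some measurements are more expensive than others.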

Christian Osendorfer

Parameter-exploring policy gradients
