On the analysis and design of software for reinforcement learning, with a survey of existing systems

Kovacs, Tim; Egginton, Robert

doi:10.1007/s10994-011-5237-8

Cited by 3 publications

(3 citation statements)

References 16 publications


“…RL toolbox, libpgrl, YORLL, and rllib [8] are C++ based platforms to develop RL algorithms in different scenarios, while CLSquare [9] is a standardized platform for testing RL problems with on-policy batch controllers. BURLAP [7], PIQLE [6], MMF, QCON, and RLPark are Java platforms that model and learn from RL problems. MDP Toolbox is an Octave based RL development platform.…”

Section: Related Work (mentioning)

confidence: 99%


RLLib: C++ Library to Predict, Control, and Represent Learnable Knowledge Using On/Off Policy Reinforcement Learning

Abeyruwan, Visser (2015). RoboCup 2015: Robot World Cup XIX


RLLib is a lightweight C++ template library that implements incremental, standard, and gradient temporal-difference learning algorithms in reinforcement learning. It is an optimized library for robotic applications and embedded devices that operates under fast duty cycles (e.g., ≤30 ms). RLLib has been tested and evaluated on RoboCup 3D soccer simulation agents, NAO V4 humanoid robots, and Tiva C series launchpad microcontrollers to predict, control, learn behavior, and represent learnable knowledge.



“…RLLib closely follows the design principles and recommendations presented in [13,24]. The development of the library has taken significant efforts to minimize memory footprint as well as computational requirements that are requested by RL problems.…”

Section: Platform (mentioning)

confidence: 99%

RLLib: C++ Library to Predict, Control, and Represent Learnable Knowledge Using On/Off Policy Reinforcement Learning

Abeyruwan, Visser (2015). RoboCup 2015: Robot World Cup XIX


“…We start this section with a reference to a more general approach of reinforcement learning (RL) which represents a broader class of planning problems where available domain knowledge is sufficient for only a partial formulation of the problem and the planning algorithm has to estimate the missing elements using simulation (Bertsekas and Tsitsiklis, 1996). A recent survey of the existing tools and software for engineering RL problem specifications presented in (Kovacs and Egginton, 2011) shows that engineering of RL domains is in most cases done either by re-implementation of required algorithms and domains/simulators or by partial re-use of the existing source code in the form of libraries or repositories, which means that engineering of RL domains is a direct implementation problem. There are no existing out of the box, domain independent environments where the specification of the RL problem would be reduced to the specification of the domain in a specific domain definition language.…”

Section: Knowledge Engineering For Planning (mentioning)

confidence: 99%

Relational approach to knowledge engineering for POMDP-based assistance systems as a translation of a psychological model

Grzes, Hoey, Khan, et al. (2014). International Journal of Approximate Reasoning


Assistive systems for persons with cognitive disabilities (e.g. dementia) are difficult to build due to the wide range of different approaches people can take to accomplishing the same task, and the significant uncertainties that arise from both the unpredictability of client's behaviours and from noise in sensor readings. Partially observable Markov decision process (POMDP) models have been used successfully as the reasoning engine behind such assistive systems for small multi-step tasks such as hand washing. POMDP models are a powerful, yet flexible framework for modelling assistance that can deal with uncertainty and utility. Unfortunately, POMDPs usually require a very labour intensive, manual procedure for their definition and construction. Our previous work has described a knowledge driven method for automatically generating POMDP activity recognition and context sensitive prompting systems for complex tasks. We call the resulting POMDP a SNAP (SyNdetic Assistance Process). The spreadsheet-like result of the analysis does not correspond to the POMDP model directly and the translation to a formal POMDP representation is required. To date, this translation had to be performed manually by a trained POMDP expert. In this paper, we formalise and automate this translation process using a probabilistic relational model (PRM) encoded in a relational database. The database encodes the relational skeleton of the PRM, and includes the goals, action preconditions, environment states, cognitive model, client and system actions (i.e., the outcome of the SNAP analysis), as well as relevant sensor models. The database is easy to approach for someone who is not an expert in POMDPs, allowing them to fill in the necessary details of a task using a simple and intuitive procedure. The database, when filled, implicitly defines a ground instance of the relational skeleton, which we extract using an automated procedure, thus generating a POMDP model of the assistance task. 
A strength of the database is that it allows constraints to be specified, such that we can verify the POMDP model is, indeed, valid for the task given the analysis. We demonstrate the method by eliciting three assistance tasks from non-experts: handwashing, and toothbrushing for elderly persons with dementia, and on a factory assembly task for persons with a cognitive disability. We validate the resulting POMDP models using case-based simulations to show that they are reasonable for the domains. We also show a complete case study of a designer specifying one database, including an evaluation in a real-life experiment with a human actor.


Research on Android Application Reinforcement Method for Mobile Medical Service

Zhou, Zhang, Shu, et al. (2019). J. Phys.: Conf. Ser.


In the field of mobile medicine, in order to improve the security of Android applications and prevent Android APP from malicious decompilation and tampering, a security reinforcement scheme for Android APP based on random confusion is proposed. By confusing a large number of random characters in the DEX header file, without affecting the execution efficiency of Android software, it increases the difficulty of decompiling the DEX file, improves the security of the source code, and ensures the correctness and integrity of the DEX file.

