Offline Learning of Counterfactual Predictions for Real-World Robotic Reinforcement Learning

Jin, Jun; Graves, Daniel; Haigh, Cameron; Luo, Jun; Jägersand, Martin

doi:10.1109/icra46639.2022.9811963

Cited by 4 publications

(3 citation statements)

References 44 publications


“…The advantage of this approach is that the control-rules are easy to explain to human operators, but since control is triggered by predictions that are continually updated in deployment, the resultant controller adapts to changing conditions. An extension of this idea is to use GVF predictions, like the ones we learned in this work, as input to a neural-network based RL agent, similarly to how it was done for autonomous driving (Graves et al, 2020; Jin et al, 2022). This work provides the foundations for these next steps in industrial control with RL.…”

Section: Discussion (mentioning)

confidence: 99%

GVFs in the real world: making predictions online for water treatment

Janjua, Shah, White et al. 2023. Mach Learn


In this paper we investigate the use of reinforcement-learning based prediction approaches for a real drinking-water treatment plant. Developing such a prediction system is a critical step on the path to optimizing and automating water treatment. Before that, there are many questions to answer about the predictability of the data, suitable neural network architectures, how to overcome partial observability and more. We first describe this dataset, and highlight challenges with seasonality, nonstationarity, partial observability, and heterogeneity across sensors and operation modes of the plant. We then describe General Value Function (GVF) predictions (discounted cumulative sums of observations) and highlight why they might be preferable to classical n-step predictions common in time series prediction. We discuss how to use offline data to appropriately pre-train our temporal difference learning (TD) agents that learn these GVF predictions, including how to select hyperparameters for online fine-tuning in deployment. We find that the TD-prediction agent obtains an overall lower normalized mean-squared error than the n-step prediction agent. Finally, we show the importance of learning in deployment, by comparing a TD agent trained purely offline with no online updating to a TD agent that learns online. This final result is one of the first to motivate the importance of adapting predictions in real-time, for non-stationary high-volume systems in the real world.
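
The abstract above contrasts GVF predictions (discounted cumulative sums of observations) with classical n-step predictions. The following is only a minimal sketch of how the two kinds of prediction targets could be computed from a recorded sensor series; it is not taken from the cited paper. The function names, the convention that the sum starts at the next observation, and the truncation at the end of the series are all assumptions made for illustration.

```python
import numpy as np


def discounted_sum_targets(obs, gamma):
    """Hypothetical helper: GVF-style target for each time step t.

    Assumed convention: G_t = sum_{k>=0} gamma**k * obs[t + 1 + k],
    truncated where the recorded series ends.
    """
    obs = np.asarray(obs, dtype=float)
    targets = np.zeros_like(obs)
    running = 0.0
    # Walk backwards so each target reuses the one after it:
    # G_t = obs[t + 1] + gamma * G_{t + 1}
    for t in range(len(obs) - 1, -1, -1):
        targets[t] = running
        running = obs[t] + gamma * running
    return targets


def n_step_targets(obs, n):
    """Hypothetical helper: classical n-step target, i.e. obs[t + n].

    Steps with no observation n steps ahead are left as NaN.
    """
    obs = np.asarray(obs, dtype=float)
    targets = np.full_like(obs, np.nan)
    if 0 < n < len(obs):
        targets[:-n] = obs[n:]
    return targets


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0, 4.0]  # made-up sensor readings
    print(discounted_sum_targets(series, gamma=0.5))
    print(n_step_targets(series, n=2))
```

Under these assumptions, the discounted target summarizes the whole future of the signal with geometrically decaying weights, while the n-step target looks at a single future point. A TD agent would learn the former incrementally instead of computing it from a complete recording as done here.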


“…Counterfactuals can be used to analyse the current situation and envisage the different outcomes by changing the antecedents to add to the knowledge. Counterfactual predictions are used by Jin et al (2022) to guide the exploration of a reinforcement learner for robotic manipulation tasks. The active counterfactual predictions generated add to the existing body of knowledge about the task for the learning model to improve its performance.…”

Section: Proposed Counterfactual Learning-based Approach For Resilien... (mentioning)

confidence: 99%

Counterfactual learning in enhancing resilience in autonomous agent systems

Samarasinghe 2023. Front. Artif. Intell.


Resilience in autonomous agent systems is about having the capacity to anticipate, respond to, adapt to, and recover from adverse and dynamic conditions in complex environments. It is associated with the intelligence possessed by the agents to preserve the functionality or to minimize the impact on functionality through a transformation, reconfiguration, or expansion performed across the system. Enhancing the resilience of systems could pave the way toward higher autonomy, allowing them to tackle intricate dynamic problems. The state-of-the-art systems have mostly focussed on improving the redundancy of the system, adopting decentralized control architectures, and utilizing distributed sensing capabilities. While machine learning approaches for efficient distribution and allocation of skills and tasks have enhanced the potential of these systems, they are still limited when presented with dynamic environments. To move beyond the current limitations, this paper advocates incorporating counterfactual learning models to give agents the ability to predict possible future conditions and adjust their behavior. Counterfactual learning is a topic that has recently been gaining attention as a model-agnostic and post-hoc technique to improve explainability in machine learning models. Using counterfactual causality can also help gain insights into unforeseen circumstances and make inferences about the probability of desired outcomes. We propose that this can be used in agent systems as a means to guide and prepare them to cope with unanticipated environmental conditions. This supplementary support for adaptation can enable the design of more intelligent and complex autonomous agent systems to address the multifaceted characteristics of real-world problem domains.


“…At this juncture, it is exciting to look for the integration of artificial intelligence in process control as it can bring in a paradigm shift by enabling more intelligent, adaptive, and efficient control of industrial processes, ultimately contributing to increased productivity and sustainability. As such, a great deal of research is being done on model-free controlling techniques, particularly on model-free reinforcement learning (RL) algorithms. In essence, a trained RL has the capabilities of generating a control policy by optimizing the cumulative reward signal over successive interactions with an environment until a desired goal is achieved.…”

Section: Introduction (mentioning)

confidence: 99%

A Twin Agent Reinforcement Learning Framework by Integrating Deterministic and Stochastic Policies

Gupta, Anand, Kumar et al. 2024. Ind. Eng. Chem. Res.
