Proceedings of the 12th International Conference on Agents and Artificial Intelligence 2020
DOI: 10.5220/0008949905220529

Uncertainty-based Out-of-Distribution Classification in Deep Reinforcement Learning

Abstract: Robustness to out-of-distribution (OOD) data is an important goal in building reliable machine learning systems. Especially in autonomous systems, wrong predictions for OOD inputs can cause safety critical situations. As a first step towards a solution, we consider the problem of detecting such data in a value-based deep reinforcement learning (RL) setting. Modelling this problem as a one-class classification problem, we propose a framework for uncertainty-based OOD classification: UBOOD. It is based on the ef…
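
A note on the truncated abstract above: the citing statement further down this page describes UBOOD as being based on the reducibility of an agent's epistemic uncertainty in its Q-value function. The following is a minimal, illustrative Python sketch of that general idea, using the disagreement of an ensemble of Q-networks as a one-class OOD score; the names (QEnsemble, ood_score), the architecture, and the threshold are assumptions for illustration, not the authors' implementation.

# Illustrative sketch: epistemic uncertainty from a Q-ensemble as a one-class OOD score.
# Assumption-based example; not the UBOOD reference implementation.
import torch
import torch.nn as nn


def make_q_net(obs_dim: int, n_actions: int) -> nn.Module:
    """A small fully connected Q-network."""
    return nn.Sequential(
        nn.Linear(obs_dim, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, n_actions),
    )


class QEnsemble(nn.Module):
    """Ensemble of independently initialised Q-networks.

    The spread of the members' Q-value predictions serves as a proxy for
    epistemic uncertainty: it is reducible (shrinks) on states covered by the
    training distribution and stays large on out-of-distribution states.
    """

    def __init__(self, obs_dim: int, n_actions: int, n_members: int = 5):
        super().__init__()
        self.members = nn.ModuleList(
            [make_q_net(obs_dim, n_actions) for _ in range(n_members)]
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Stack member predictions: shape (n_members, batch, n_actions).
        return torch.stack([m(obs) for m in self.members])

    @torch.no_grad()
    def ood_score(self, obs: torch.Tensor) -> torch.Tensor:
        """Higher score = more likely out-of-distribution (one-class setting)."""
        q_values = self.forward(obs)           # (M, B, A)
        std_per_action = q_values.std(dim=0)   # member disagreement per action
        return std_per_action.mean(dim=-1)     # one scalar score per state


# Usage: calibrate a threshold on in-distribution validation states, then flag
# states whose score exceeds it as OOD.
ensemble = QEnsemble(obs_dim=8, n_actions=4)
scores = ensemble.ood_score(torch.randn(16, 8))
threshold = 0.5  # placeholder; would be calibrated on held-out in-distribution data
is_ood = scores > threshold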

Cited by 11 publications (5 citation statements); references 17 publications. Citation types: 0 supporting, 5 mentioning, 0 contrasting.

Citation statements (ordered by relevance):

“…Other related topics could also provide inspiration for how to learn rules of behavior that generalize to novel situations due to changing sets of agents and tasks in OASYS. These include (1) out-of-distribution learning (e.g., Sedlmeier et al 2020; Haider et al 2023), where agents detect that their current tasks are different from those experienced during training and must adapt their behavior to new situations, (2) lifelong learning (e.g., Thrun and Mitchell 1995; Ammar et al 2014; Chen and Liu 2018; Mendez, van Seijen, and Eaton 2022) where agents learn how to complete future tasks based on knowledge gained from previously learned tasks, and (3) multitask learning (e.g., Tanaka and Yamamura 2003; Andreas, Klein, and Levine 2017; Rajeswaran et al 2017; Sodhani, Zhang, and Pineau 2021) where agents learn how to generalize to complete a given set of tasks, potentially exploiting task similarities and differences to quickly improve performance on the tasks. Recently, Zhang et al (2023) have also studied decision making through multiagent RL when other agents' policies abruptly change during operations, which could be useful for guiding RL under task and type openness.…”
Section: Reinforcement Learning in OASYS (mentioning)
confidence: 99%
“…These include (1) out‐of‐distribution learning (e.g., Sedlmeier et al. 2020; Haider et al. 2023), where agents detect that their current tasks are different from those experienced during training and must adapt their behavior to new situations, (2) lifelong learning (e.g., Thrun and Mitchell 1995; Ammar et al.…”
Section: Decision Making in OASYS (mentioning)
confidence: 99%
“…The differences between scenarios and data sets will change the relative performance of the methods [63,64].
(11) Pre-trains a model on OOD auxiliary outputs and fine-tunes this model with the pseudolabels [65].
(12) Nash equilibria of these games are closer to the ideal OOD solutions than the standard empirical risk minimization (ERM) [66].
(13) Interval bound propagation (IBP) is used to upper bound the maximal confidence in the l∞-ball and minimize this upper bound during training time [67].
(14) …”
Section: Number (mentioning)
confidence: 99%
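
The last item in the statement above refers to interval bound propagation (IBP) being used to upper bound the maximal confidence inside an l∞-ball and to minimize that bound during training. Below is a minimal Python sketch of that idea for a small fully connected classifier; the network layout and the helper names (ibp_linear, confidence_upper_bound) are assumptions for illustration, not the cited paper's code.

# Illustrative sketch: interval bound propagation (IBP) to upper-bound the
# softmax confidence for all inputs within an l_inf-ball of radius eps.
import torch
import torch.nn as nn


def ibp_linear(layer: nn.Linear, lower: torch.Tensor, upper: torch.Tensor):
    """Propagate elementwise interval bounds through an affine layer."""
    center = (upper + lower) / 2
    radius = (upper - lower) / 2
    new_center = layer(center)
    new_radius = radius @ layer.weight.abs().t()
    return new_center - new_radius, new_center + new_radius


def confidence_upper_bound(model: nn.Sequential, x: torch.Tensor, eps: float) -> torch.Tensor:
    """Upper bound on the maximal softmax probability over the eps-ball around x."""
    lower, upper = x - eps, x + eps
    for layer in model:
        if isinstance(layer, nn.Linear):
            lower, upper = ibp_linear(layer, lower, upper)
        elif isinstance(layer, nn.ReLU):
            lower, upper = layer(lower), layer(upper)  # ReLU is monotone
        else:
            raise NotImplementedError(f"unsupported layer: {layer}")
    # softmax_k <= 1 / (1 + sum_{j != k} exp(lower_j - upper_k)) for each class k.
    diff = lower.unsqueeze(-2) - upper.unsqueeze(-1)      # entry (k, j) = lower_j - upper_k
    diff = diff - torch.eye(diff.shape[-1]) * 1e9         # mask out j == k
    bound_per_class = 1.0 / (1.0 + diff.exp().sum(dim=-1))
    return bound_per_class.max(dim=-1).values


# The bound is differentiable, so it could be added as a penalty on OOD inputs
# during training to push the worst-case confidence down.
model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 4))
upper_conf = confidence_upper_bound(model, torch.randn(16, 8), eps=0.1)
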
“…PEOC (Sedlmeier et al., 2020b), for example, uses the policy entropy of an RL agent trained using policy-gradient methods to detect increased epistemic uncertainty in untrained situations. UBOOD (Sedlmeier et al., 2020a), by contrast, is applicable to value-based RL settings and is based on the reducibility of an agent's epistemic uncertainty in its Q-value function. Although the methods differentiate between aleatoric and epistemic uncertainty to detect OOD situations, multimodality is not a focus.…”
Section: Uncertainty-based OOD Detection (mentioning)
confidence: 99%
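
The statement above describes PEOC as using the policy entropy of an agent trained with policy-gradient methods as a signal of increased epistemic uncertainty in untrained situations. A minimal Python sketch of that kind of entropy-based OOD score follows; the network layout, names, and threshold are illustrative assumptions, not the paper's implementation.

# Illustrative sketch: policy-entropy-based OOD scoring in the spirit of PEOC.
import torch
import torch.nn as nn

# Stand-in for a policy network trained with a policy-gradient method
# (4 discrete actions over an 8-dimensional observation).
policy_net = nn.Sequential(
    nn.Linear(8, 64), nn.ReLU(),
    nn.Linear(64, 4),
)


@torch.no_grad()
def policy_entropy(obs: torch.Tensor) -> torch.Tensor:
    """Entropy of the action distribution; high entropy suggests a situation
    the policy was not trained on (potentially out-of-distribution)."""
    dist = torch.distributions.Categorical(logits=policy_net(obs))
    return dist.entropy()


scores = policy_entropy(torch.randn(16, 8))
threshold = 1.2  # placeholder; would be calibrated on in-distribution states
is_ood = scores > threshold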