2017 XLIII Latin American Computer Conference (CLEI)
DOI: 10.1109/clei.2017.8226439
Towards autonomous reinforcement learning: Automatic setting of hyper-parameters using Bayesian optimization

Abstract: With the increase in machine learning usage by industry and the scientific community in a variety of tasks, such as text mining, image recognition and self-driving cars, automatic setting of hyper-parameters in learning algorithms is a key factor for achieving satisfactory performance regardless of user expertise in the inner workings of the techniques and methodologies. In particular, for a reinforcement learning algorithm, the efficiency of an agent learning a control policy in an uncertain environment is heav…


Cited by 17 publications (5 citation statements) | References 12 publications
“…One popular way of automating algorithm selection is to model the problem as a MAB problem and assign an action to each algorithm [26]. BO, which works well for automated machine learning, has been used for hyper-parameter tuning of RL algorithms [27] and for adjusting the weights of different objectives in the reward function [28]. In [29], the hyper-parameters of the RL algorithm and the network structure are jointly optimized using a genetic algorithm in which each individual is a DRL agent.…”
Section: B. Automated Reinforcement Learning (AutoRL) — citation type: mentioning
confidence: 99%
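The tuning loop the snippet attributes to [27] can be sketched as an outer optimizer that repeatedly evaluates hyper-parameter configurations on an RL objective. The sketch below is hedged: `run_agent` is a made-up analytic stand-in for training an agent and returning its average reward, and random search is used as a simplified stand-in for the Bayesian-optimization loop (real BO would fit a surrogate model to past (config, score) pairs and pick the next config via an acquisition function).

```python
import random

def run_agent(alpha, gamma):
    """Hypothetical objective: in the real setting this would train an
    RL agent with the given hyper-parameters and return its average
    reward; here an analytic stand-in keeps the sketch self-contained."""
    return -(alpha - 0.3) ** 2 - (gamma - 0.9) ** 2

random.seed(0)
best, best_score = None, float("-inf")
# Random search as a simplified stand-in for the BO loop: each iteration
# proposes a (learning rate, discount factor) config, scores it, and
# keeps the best one seen so far.
for _ in range(50):
    alpha = random.uniform(0.01, 1.0)   # learning rate
    gamma = random.uniform(0.0, 1.0)    # discount factor
    score = run_agent(alpha, gamma)
    if score > best_score:
        best, best_score = (alpha, gamma), score
```

Swapping the uniform sampling for a surrogate-guided proposal (e.g. expected improvement over a Gaussian-process model) turns this skeleton into the BO scheme the citation describes.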
“…These algorithms establish a method for defining the relationship between a set of variables (representing the characteristics) and a continuous target variable. Examples of such algorithms applied in self-driving systems include Bayesian regression [71], neural-network regression [72] and decision-forest regression [73].…”
Section: Regression — citation type: mentioning
confidence: 99%
“…The learning rate (0 < α ≤ 1) regulates how quickly new information overwrites the learning already stored in the Q matrix. The discount factor, in turn, controls the influence of future rewards: if γ = 0, only the immediate reinforcement has influence; if 0 < γ < 1, future rewards are discounted; if γ = 1, rewards are not discounted. In this respect, a challenge in RL is configuring the learning parameters so as to maximize performance in the domain under study, since ε, α and γ can take on different combinations of values (Schweighofer and Doya, 2003; Even-Dar and Mansour, 2003; Barsce et al., 2017; Ottoni et al., 2018).…”
Section: Aprendizado por Reforço (Reinforcement Learning) — citation type: unclassified
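The roles of α and γ in that snippet can be made concrete with a single Q-learning update. This is a minimal sketch with made-up state and action counts, not code from the cited papers:

```python
import numpy as np

n_states, n_actions = 5, 2        # illustrative sizes
alpha, gamma = 0.1, 0.9           # learning rate, discount factor
Q = np.zeros((n_states, n_actions))

def q_update(s, a, r, s_next):
    """One Q-learning step: gamma discounts the estimated future value,
    and alpha controls how much the new information overwrites the
    estimate already stored in the Q matrix."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])

q_update(0, 1, 1.0, 2)            # Q[0, 1] moves 10% of the way toward 1.0
```

With γ = 0 the `td_target` reduces to the immediate reward r, and with γ = 1 future value enters undiscounted, matching the two limiting cases the snippet describes.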
“…Indeed, one of the main aspects of ML, and of RL as well, is the estimation of parameters such as the learning rate, the discount factor, ε-greedy and the reward function (Schweighofer and Doya, 2003; Even-Dar and Mansour, 2003; Barsce et al., 2017; Ottoni et al., 2018; Liessner et al., 2019; Hutter et al., 2019). In this respect, the challenge is to propose methods for optimizing and recommending parameters so as to maximize learning performance (Hutter et al., 2019).…”
Section: Introduction — citation type: unclassified