Abstract: Balancing exploration and exploitation is one of the key problems of action selection in Q-learning. Pure exploitation causes the agent to converge quickly to locally optimal policies, whereas excessive exploration degrades the performance of the Q-learning algorithm, even though it may accelerate learning and help avoid locally optimal policies. In this paper, finding the optimal policy in Q-learning is cast as a search for the optimal solution in combinatorial optimization. The Metropolis criterion of the simulated annealing algorithm is introduced to balance exploration and exploitation in Q-learning, and the modified Q-learning algorithm based on this criterion, SA-Q-learning, is presented. Experiments show that SA-Q-learning converges more quickly than Q-learning or Boltzmann exploration, and that the search does not suffer from performance degradation due to excessive exploration.
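As a rough illustration of how a Metropolis-style test could drive action selection, the following Python sketch compares a randomly drawn action against the greedy one and accepts the random action with a temperature-dependent probability. The abstract does not give the acceptance rule or the cooling schedule, so the exponential acceptance form, the geometric cooling, and the names `metropolis_select`, `cool`, `rate`, and `floor` are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def metropolis_select(q_values, temperature, rng=random):
    """Pick an action index for one state using a Metropolis-style test.

    q_values: list of Q(s, a) estimates for the current state.
    temperature: positive float; higher values mean more exploration.
    The acceptance rule below is an assumed form, not taken from the abstract.
    """
    # Greedy (exploiting) action: index of the largest Q-value.
    greedy = max(range(len(q_values)), key=q_values.__getitem__)
    # Exploring candidate: an action drawn uniformly at random.
    candidate = rng.randrange(len(q_values))
    # Q(candidate) <= Q(greedy), so delta <= 0 and the acceptance
    # probability lies in [0, 1] (it can underflow to 0.0).
    delta = q_values[candidate] - q_values[greedy]
    accept_prob = math.exp(delta / temperature)
    return candidate if rng.random() < accept_prob else greedy


def cool(temperature, rate=0.95, floor=1e-3):
    """Geometric cooling (assumed schedule), bounded below by `floor`."""
    return max(floor, temperature * rate)
```

At a high temperature nearly every random candidate is accepted, so selection is close to uniform; as the temperature is lowered, worse candidates are rejected more often and selection approaches the greedy choice. In a training loop one would typically call `cool` once per episode, but that placement is also an assumption.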
Robots used in manufacturing today are tailored to their tasks by system integration based on expert knowledge of both production and machine control. For upcoming generations of even more flexible robot solutions, in applications such as dexterous assembly, robot setup and programming become even more challenging. Reuse of solutions in terms of parameters, controls, process tuning, and software modules in general then becomes increasingly important. There has been valuable progress in the reuse of automation solutions when machines comply with standards and behave according to nominal models. However, more flexible robots with sensor-based manipulation skills and cognitive functions for human interaction are far too complex to manage, and solutions are rarely reusable since knowledge is either implicit in imperative software or not captured in machine-readable form. We propose techniques that build on existing knowledge by converting structured data into an RDF-based knowledge base. Through enhancements of industrial control systems and available engineering tools, such knowledge can be gradually extended as part of the interaction during the definition of the robot task.
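To make the idea of converting structured data into RDF concrete, here is a minimal Python sketch that turns one flat record into N-Triples statements. The abstract does not describe the actual conversion pipeline, vocabulary, or tooling, so the function `to_ntriples`, the `example.org` IRIs, and the property name `payloadKg` are hypothetical placeholders, and every value is emitted as a plain string literal for simplicity.

```python
def to_ntriples(subject_iri, record, vocab):
    """Serialize a flat dict as N-Triples lines (one triple per key).

    subject_iri: IRI identifying the described resource (hypothetical here).
    record: mapping of property names to values.
    vocab: IRI prefix prepended to each key to form the predicate IRI.
    """
    lines = []
    for key, value in record.items():
        # Escape the characters that N-Triples string literals require.
        escaped = (
            str(value)
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        lines.append(f'<{subject_iri}> <{vocab}{key}> "{escaped}" .')
    return lines


# Hypothetical usage: one robot record with a single property.
triples = to_ntriples(
    "http://example.org/robot/r1",
    {"payloadKg": 5},
    "http://example.org/vocab#",
)
```

A real knowledge base would more likely use typed literals and an established ontology rather than ad hoc predicates; the sketch only shows the record-to-triple mapping.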