Transferring high-level knowledge from a source task to a target task is an effective way to expedite reinforcement learning (RL). For example, propositional logic and first-order logic have been used as representations of such knowledge. We study the transfer of knowledge between tasks in which the timing of the events matters; we call such tasks temporal tasks. We concretize similarity between temporal tasks through a notion of logical transferability, and develop a transfer learning approach between different yet similar temporal tasks. We first propose an inference technique to extract metric interval temporal logic (MITL) formulas in sequential disjunctive normal form from labeled trajectories collected during RL of the two tasks. If logical transferability is identified through this inference, we construct a timed automaton for each sequential conjunctive subformula of the inferred MITL formulas from both tasks. For the source task, we perform RL on the extended state, which includes the locations and clock valuations of its timed automata. We then establish mappings between the corresponding components (clocks, locations, etc.) of the timed automata from the two tasks, and transfer the extended Q-functions based on these mappings. Finally, we perform RL on the extended state for the target task, starting with the transferred extended Q-functions. Results from two case studies show that, depending on how similar the source and target tasks are, performing RL in the extended state space improves the sampling efficiency for the target task by up to one order of magnitude, and using the transferred extended Q-functions improves it by up to another order of magnitude.
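As a rough illustration of the learning and transfer steps summarized above, the following Python sketch shows tabular Q-learning over an extended state (environment state, timed-automaton location, discretized clock valuation) and the initialization of a target task's Q-function from a source task's Q-function through a mapping between automaton locations. The environment interface, the TimedAutomaton class, and the mapping arguments are illustrative assumptions for this sketch, not identifiers or algorithms from the paper.

```python
# Minimal sketch (illustrative only, not the paper's implementation) of
# Q-learning on an extended state and of transferring the learned Q-values.
import random
from collections import defaultdict


class TimedAutomaton:
    """Toy timed automaton with a single clock.

    transitions: {(location, label): (next_location, (lower, upper))}
    A transition is taken when the given label is observed while the clock
    value lies in [lower, upper]; taking it resets the clock.
    """

    def __init__(self, transitions):
        self.transitions = transitions

    def step(self, location, clock, label, dt):
        clock += dt
        if (location, label) in self.transitions:
            next_location, (lower, upper) = self.transitions[(location, label)]
            if lower <= clock <= upper:
                return next_location, 0.0  # take the transition, reset clock
        return location, clock


def q_learning(env, automaton, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
    """Tabular Q-learning on the extended state (state, location, clock)."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s, loc, clk = env.reset(), 0, 0.0
        done = False
        while not done:
            ext = (s, loc, round(clk, 1))
            if random.random() < eps:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda act: Q[(ext, act)])
            # Assumed environment interface: next state, observed event label,
            # elapsed time, reward, and termination flag.
            s2, label, dt, r, done = env.step(a)
            loc2, clk2 = automaton.step(loc, clk, label, dt)
            ext2 = (s2, loc2, round(clk2, 1))
            best_next = max(Q[(ext2, act)] for act in env.actions)
            Q[(ext, a)] += alpha * (r + gamma * best_next - Q[(ext, a)])
            s, loc, clk = s2, loc2, clk2
    return Q


def transfer_q(Q_source, location_map, clock_scale=1.0):
    """Initialize the target task's Q-table from the source task's,
    mapping source automaton locations to target locations and rescaling
    the clock component."""
    Q_target = defaultdict(float)
    for ((s, loc, clk), a), value in Q_source.items():
        if loc in location_map:
            ext = (s, location_map[loc], round(clk * clock_scale, 1))
            Q_target[(ext, a)] = value
    return Q_target
```

In this sketch the mapping between clocks is reduced to a single rescaling factor for brevity; the mappings between clocks, locations, and other automaton components described in the abstract are more general.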
Introduction

Reinforcement learning (RL) has been successful in numerous applications. In practice, though, it often requires extensive exploration of the environment to achieve satisfactory performance, especially for complex tasks with sparse rewards [1].

The sampling efficiency and performance of RL can be improved if high-level knowledge is incorporated into the learning process [2]. Such knowledge can also be transferred from a source task to a target task if the two tasks are logically similar [3]. For example, propositional logic and first-order logic have been used to represent knowledge in the form of logical structures for transfer learning [4], and incorporating such logical similarities has been shown to expedite RL for the target task [5].

The transfer of high-level knowledge can also be applied to tasks in which the timing of the events matters. We call such tasks temporal tasks.
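As a hypothetical illustration (not one of the case studies considered later), a temporal task might require that a region labeled $A$ be reached within 5 time units and that a region labeled $B$ be reached between 3 and 10 time units after that. In MITL, such a requirement can be written as
\[
\Diamond_{[0,5]}\bigl(A \wedge \Diamond_{[3,10]} B\bigr),
\]
where $\Diamond_{[a,b]}$ denotes the time-bounded eventually operator.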