Functional test generation from UI test scenarios using reinforcement learning for android applications

Koroglu, Yavuz; Şen, Alper

doi:10.1002/stvr.1752

Cited by 19 publications

(14 citation statements)

References 41 publications

“…Constraint solving is modeled as a Markov Decision Process and the RL agent is first trained offline, then applied online. Learning, as used by [48,49], corrects this by using a different function to estimate future rewards. [45] adopted a deep convolutional NN to guide RL.…”

Section: Examining Specific Practices (mentioning)

confidence: 99%

“…usage specifications [47,48], unique code functions called [50], a curiosity factor favoring exploration of new elements [51,54], coverage of interaction methods (e.g. click, drag) [46], and avoidance of navigation loops [44].…”

Section: GUI Test Generation (mentioning)

confidence: 99%

“…Rather than state coverage, [49] base reward on finding violations of specifications. [14], [55,56] use supervised ML to generate sequences of interactions.…”

Section: GUI Test Generation (mentioning)

confidence: 99%

“…RQ4 examines specific ML techniques. Some authors have chosen algorithms because they worked well in previous work (e.g., [48,101]). Others saw algorithms work on similar problems outside of test generation (e.g., [16]), or chose algorithms thought to represent the state-of-the-art for a problem class (e.g., [30]).…”

Section: RQ4: ML Techniques Applied (mentioning)

confidence: 99%

The Integration of Machine Learning into Automated Test Generation: A Systematic Literature Review

Fontes

Gay

2022

Preprint

Context: Machine learning (ML) may enable effective automated test generation. Objectives: We characterize emerging research, examining testing practices, researcher goals, ML techniques applied, evaluation, and challenges. Methods: We perform a systematic literature review on a sample of 97 publications. Results: ML generates input for system, GUI, unit, performance, and combinatorial testing or improves the performance of existing generation methods. ML is also used to generate test verdicts, property-based, and expected output oracles. Supervised learning (often based on neural networks) and reinforcement learning (often based on Q-learning) are common, and some publications also employ unsupervised or semi-supervised learning. (Semi-/Un-)Supervised approaches are evaluated using both traditional testing metrics and ML-related metrics (e.g., accuracy), while reinforcement learning is often evaluated using testing metrics tied to the reward function. Conclusion: Work to date shows great promise, but there are open challenges regarding training data, retraining, scalability, evaluation complexity, ML algorithms employed (and how they are applied), benchmarks, and replicability. Our findings can serve as a roadmap and inspiration for researchers in this field.

“…The first category leverages human-provided oracles to find non-crashing bugs in different strategies. These oracles are usually encoded in assertions (e.g., Thor [Adamsen et al 2015], ChimpCheck [Lam et al 2017], AppFlow [Hu et al 2018], ACAT [Rosenfeld et al 2018], AppTestMigrator [Behrang and Orso 2019], CraftDroid [Lin et al 2019]), linear-time temporal logic (LTL) formulas (e.g., FARLEAD-Android [Köroglu and Sen 2021]), or semantic models in Alloy (e.g., Augusto [Mariani et al 2018], which targets web applications). One special case in this category is the oracles derived from human-written app specifications.…”

Section: Related Work (mentioning)

confidence: 99%

Fully automated functional fuzzing of Android apps for detecting non-crashing logic bugs

Yan

Wang

et al. 2021

Proc. ACM Program. Lang.

Android apps are GUI-based event-driven software and have become ubiquitous in recent years. Obviously, functional correctness is critical for an app’s success. However, in addition to crash bugs, non-crashing functional bugs (in short as “non-crashing bugs” in this work) like inadvertent function failures, silent user data lost and incorrect display information are prevalent, even in popular, well-tested apps. These non-crashing functional bugs are usually caused by program logic errors and manifest themselves on the graphic user interfaces (GUIs). In practice, such bugs pose significant challenges in effectively detecting them because (1) current practices heavily rely on expensive, small-scale manual validation (the lack of automation); and (2) modern fully automated testing has been limited to crash bugs (the lack of test oracles). This paper fills this gap by introducing independent view fuzzing, a novel, fully automated approach for detecting non-crashing functional bugs in Android apps. Inspired by metamorphic testing, our key insight is to leverage the commonly-held independent view property of Android apps to manufacture property-preserving mutant tests from a set of seed tests that validate certain app properties. The mutated tests help exercise the tested apps under additional, adverse conditions. Any property violations indicate likely functional bugs for further manual confirmation. We have realized our approach as an automated, end-to-end functional fuzzing tool, Genie. Given an app, (1) Genie automatically detects non-crashing bugs without requiring human-provided tests and oracles (thus fully automated); and (2) the detected non-crashing bugs are diverse (thus general and not limited to specific functional properties), which set Genie apart from prior work.
We have evaluated Genie on 12 real-world Android apps and successfully uncovered 34 previously unknown non-crashing bugs in their latest releases — all have been confirmed, and 22 have already been fixed. Most of the detected bugs are nontrivial and have escaped developer (and user) testing for at least one year and affected many app releases, thus clearly demonstrating Genie’s effectiveness. According to our analysis, Genie achieves a reasonable true positive rate of 40.9%, while these 34 non-crashing bugs could not be detected by prior fully automated GUI testing tools (as our evaluation confirms). Thus, our work complements and enhances existing manual testing and fully automated testing for crash bugs.

The integration of machine learning into automated test generation: A systematic mapping study

Fontes

Gay

2023

Software Testing Verif & Rel

Machine learning (ML) may enable effective automated test generation. We characterize emerging research, examining testing practices, researcher goals, ML techniques applied, evaluation, and challenges in this intersection. We perform a systematic mapping study on a sample of 124 publications. ML generates input for system, GUI, unit, performance, and combinatorial testing or improves the performance of existing generation methods. ML is also used to generate test verdicts, property‐based, and expected output oracles. Supervised learning (often based on neural networks) and reinforcement learning (often based on Q‐learning) are common, and some publications also employ unsupervised or semi‐supervised learning. (Semi‐/Un‐)Supervised approaches are evaluated using both traditional testing metrics and ML‐related metrics (e.g., accuracy), while reinforcement learning is often evaluated using testing metrics tied to the reward function. The work to date shows great promise, but there are open challenges regarding training data, retraining, scalability, evaluation complexity, ML algorithms employed (and how they are applied), benchmarks, and replicability. Our findings can serve as a roadmap and inspiration for researchers in this field.
