Android, the #1 mobile app framework, enforces the single-GUI-thread model, in which a single UI thread manages GUI rendering and event dispatching. Under this model, it is vital for responsiveness to avoid blocking the UI thread. One common practice is to offload long-running tasks to async threads. To support this, Android provides various async programming constructs and leaves it to developers to obey the rules implied by the model. However, as our study reveals, more than 25% of apps violate these rules and introduce hard-to-detect, fail-stop errors, which we term async programming errors (APEs). To this end, this paper introduces APEChecker, a technique to automatically and efficiently manifest APEs. The key idea is to characterize APEs as specific fault patterns, and to synergistically combine static analysis and dynamic UI exploration to detect and verify such errors. On 40 real-world Android apps, APEChecker unveils and processes 61 APEs, of which 51 are confirmed (83.6% hit rate). Specifically, APEChecker detects 3X more APEs than the state-of-the-art testing tools (Monkey, Sapienz and Stoat), and reduces testing time from half an hour to a few minutes. On one specific type of APE, APEChecker confirms 5X more errors than the data race detection tool EventRacer, with very few false alarms.
CCS CONCEPTS
• Software and its engineering → Software testing and debugging;
We term fatal programming errors that violate the rules implied by the single-UI-thread model async programming errors (APEs).
Such bugs in Android are not easy to detect manually, for three reasons: (1) they usually reside in the code that handles interactions between the UI thread and async threads, which can be rather complicated to analyze manually; (2) they can only be triggered in the right states of GUI components (e.g., activity, fragment), which have complicated lifecycles [21,30]; (3) they have to be triggered under the right thread scheduling, while the execution time of async threads is affected by the task and its running environment (e.g., network stability, system load).
Even worse, existing bug detection techniques are ineffective for such bugs. First, most GUI testing techniques, e.g., random testing [39,57], search-based testing [58,60], and model-based testing [2,3,8,78,85], are designed for functional testing in general. They aim at enumerating all possible event sequences (GUI-level events in particular) to manifest bugs, which is unscalable and time-consuming. Additionally, they mainly aim at improving code coverage, which may not be sufficient for exhibiting APEs, since APEs require specific event sequences with appropriate lifecycle states and thread scheduling. Second, static analysis tools, e.g., Lint [35], FindBugs [19] and PMD [67], although scalable, only enforce simple rules (syntax or trivial control/data-flow analysis) to locate suspicious bugs. For example, Lint declares it can find "WrongThread" errors (one type of APEs) [24]. However, as our evaluation in Section 5 demonstrates, Lint incurs a number of false negatives, failing to detect those so...