Object detection for graphical user interface: old fashioned or deep learning or a combination?

Chen, Jieshan; Xie, Mulong; Xing, Zhenchang; Chen, Chunyang; Xu, Xiwei; Zhu, Liming; Li, Guoqiang

doi:10.1145/3368089.3409691

Cited by 115 publications

(47 citation statements)

References 51 publications

“…For instance, the OutSpoken screen reader for Windows 3.1 allowed users to label icons on the screen, which it then recognizes from their pixels alone [69]. Inferring information from pixels of interfaces has been applied in diverse applications such as interface augmentation and remapping [11,17,30,79], GUI testing [78], data-driven design for GUI search [20,22,45] or prototyping [70], generating UI code from existing apps to support app development [12,21,24,53,57], and GUI security [25]. Some work also employed pixel-based methods to improve accessibility, such as Prefab, which augments existing app interface with target-aware pointing techniques that enhance interaction for people with motor impairments [31].…”

Section: UI Detection From Pixels (mentioning)

confidence: 99%

“…There are multiple approaches to pixel-based interpretation of interfaces. Recent work by Chen et al [24] categorizes and evaluates two major GUI detection approaches: using traditional image processing methods (e.g., edge/contour detection [57], template matching [30,61,78]) and using deep learning models trained on large-scale GUI data [21,24].…”

Section: UI Detection From Pixels (mentioning)

confidence: 99%
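
As a rough illustration of the template-matching idea named in the statement above, the following minimal sketch slides a small icon over a screenshot and scores each position by the sum of squared differences (SSD). It is not the method of any cited paper; the function name `match_template`, the list-of-lists grayscale representation, and the toy pixel values are assumptions made purely for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: a grayscale image as rows of integer pixels.
Image = List[List[int]]


def match_template(screen: Image, template: Image) -> Tuple[int, int, int]:
    """Return (row, col, ssd) for the best placement of template in screen.

    Every placement is scored by the sum of squared pixel differences;
    the lowest score wins, and ties keep the first placement found.
    """
    sh, sw = len(screen), len(screen[0])
    th, tw = len(template), len(template[0])
    if th > sh or tw > sw:
        raise ValueError("template is larger than the screen")

    best = None
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            ssd = sum(
                (screen[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            if best is None or ssd < best[2]:
                best = (r, c, ssd)
    return best


# Toy data (assumed): a 4x5 "screenshot" containing one 2x2 "icon".
screen = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 1, 0],
    [0, 0, 1, 9, 0],
    [0, 0, 0, 0, 0],
]
icon = [[9, 1], [1, 9]]

row, col, ssd = match_template(screen, icon)
```

An exhaustive pixel comparison like this only finds near-exact copies of a known widget, which is one reason the quoted statement contrasts such traditional methods with deep learning models trained on large-scale GUI data.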

“…In our work, we built and integrated a model to predict the clickability of some UI elements based on similar features (i.e., size, location, icon type); however, our goal is to predict the actual clickability of UI elements rather than the human perception; as [71] shows, these two often mismatch. Beyond previous work to detect UI elements from screenshots [12,24,53,57], we additionally detect the selection state for relevant UI element types (e.g., Toggle, Checkbox) by inferring additional UI element type classes.…”

Section: Understanding UI Semantics (mentioning)

confidence: 99%

Screen Recognition: Creating Accessibility Metadata for Mobile Applications from Pixels

Zhang, Greef, Swearngin, et al. 2021

Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems

Many accessibility features available on mobile platforms require applications (apps) to provide complete and accurate metadata describing user interface (UI) components. Unfortunately, many apps do not provide sufficient metadata for accessibility features to work as expected. In this paper, we explore inferring accessibility metadata for mobile apps from their pixels, as the visual interfaces often best reflect an app's full functionality. We trained a robust, fast, memory-efficient, on-device model to detect UI elements using a dataset of 77,637 screens (from 4,068 iPhone apps) that we collected and annotated. To further improve UI detections and add semantic information, we introduced heuristics (e.g., UI grouping and ordering) and additional models (e.g., recognize UI content, state, interactivity). We built Screen Recognition to generate accessibility metadata to augment iOS VoiceOver. In a study with 9 screen reader users, we validated that our approach improves the accessibility of existing mobile apps, enabling even previously inaccessible apps to be used. CCS Concepts: Human-centered computing → Accessibility technologies.

“…Fischer et al [53] transfer the style from fine art to GUI. Chen et al [54] study different GUI element detection methods on large-scale GUI data and develop UIED [55] to handle diverse and complicated GUI images. Other supporting works such as GUI tag prediction [56] and GUI component gallery construction [57] can enhance designers' searching efficiency.…”

Section: A GUI Design (mentioning)

confidence: 99%

GUIGAN: Learning to Generate GUI Designs Using Generative Adversarial Networks

Zhao, Chen, Liu, et al. 2021

2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)

Self Cite

Graphical User Interface (GUI) is ubiquitous in almost all modern desktop software, mobile applications, and online websites. A good GUI design is crucial to the success of the software in the market, but designing a good GUI which requires much innovation and creativity is difficult even to well-trained designers. Besides, the requirement of the rapid development of GUI design also aggravates designers' working load. So, the availability of various automated generated GUIs can help enhance the design personalization and specialization as they can cater to the taste of different designers. To assist designers, we develop a model GUIGAN to automatically generate GUI designs. Different from conventional image generation models based on image pixels, our GUIGAN is to reuse GUI components collected from existing mobile app GUIs for composing a new design that is similar to natural-language generation. Our GUIGAN is based on SeqGAN by modeling the GUI component style compatibility and GUI structure. The evaluation demonstrates that our model significantly outperforms the best of the baseline methods by 30.77% in Frechet Inception distance (FID) and 12.35% in 1-Nearest Neighbor Accuracy (1-NNA). Through a pilot user study, we provide initial evidence of the usefulness of our approach for generating acceptable brand new GUI designs.

“…We use computer-vision techniques to achieve these two goals. In particular, we use the UI widget detection tool [23] to detect non-text UI widget regions (e.g., icon, button), and use EAST [20] to detect text regions. The detected icons and images help to validate the guidelines regarding icon usage, such as "don't use same icons to represent different destinations in navigation drawer".…”

Section: Parsing Input UI Design Image (mentioning)

confidence: 99%

Don’t Do That! Hunting Down Visual Design Smells in Complex UIs Against Design Guidelines

Yang, Xing, Xia, et al. 2021

2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)

Self Cite

Just like code smells in source code, UI design has visual design smells. We study 93 don't-do-that guidelines in the Material Design, a complex design system created by Google. We find that these don't-guidelines go far beyond UI aesthetics, and involve seven general design dimensions (layout, typography, iconography, navigation, communication, color, and shape) and four component design aspects (anatomy, placement, behavior, and usage). Violating these guidelines results in visual design smells in UIs (or UI design smells). In a study of 60,756 UIs of 9,286 Android apps, we find that 7,497 UIs of 2,587 apps have at least one violation of some Material Design guidelines. This reveals the lack of developer training and tool support to avoid UI design smells. To fill this gap, we design an automated UI design smell detector (UIS-Hunter) that extracts and validates multi-modal UI information (component metadata, typography, iconography, color, and edge) for detecting the violation of diverse don't-guidelines in Material Design. The detection accuracy of UIS-Hunter is high (precision=0.81, recall=0.90) on the 60,756 UIs of 9,286 apps. We build a guideline gallery with real-world UI design smells that UIS-Hunter detects for developers to learn the best Material Design practices. Our user studies show that UIS-Hunter is more effective than manual detection of UI design smells, and the UI design smells that are detected by UIS-Hunter have severely negative impacts on app users.
