Star ratings given by the users of mobile apps directly impact the revenue of their developers. At the same time, on popular platforms like Android, these apps must run on hundreds of devices, increasing the chance of device-specific problems. Device-specific problems could impact the rating assigned to an app, given the varying capabilities of devices (e.g., hardware and software). To fix device-specific problems, developers must test their apps on a large number of Android devices, which is costly and inefficient. Therefore, to help developers pick which devices to test their apps on, we propose using the devices that are mentioned in user reviews. We mine the user reviews of 99 free game apps and find that apps receive user reviews from a large number of devices: between 38 and 132 unique devices. However, most of the reviews (80%) originate from a small subset of devices (on average, 33%). Furthermore, we find that developers of new game apps with no reviews can use the review data of similar game apps to select the devices that they should focus on first. Finally, among the set of devices that generate the most reviews for an app, we find that some devices tend to generate worse ratings than others. Our findings indicate that, by focusing on the devices with the most reviews (in particular the ones with negative ratings), developers can effectively prioritize their limited Quality Assurance (QA) efforts, since these devices have the greatest impact on ratings.
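The prioritization idea described above can be illustrated with a minimal sketch: aggregate reviews by device, keep the smallest set of devices that covers 80% of the reviews, and flag the devices in that set whose mean rating falls below the app's overall mean. The review records and device names below are hypothetical placeholders, not data from the study.

```python
from collections import defaultdict

# Hypothetical review records mined for one app: (device_model, star_rating).
reviews = [
    ("Galaxy S3", 2), ("Galaxy S3", 1), ("Nexus 4", 5),
    ("Galaxy S3", 3), ("Nexus 4", 4), ("Droid RAZR", 1),
]

# Count reviews and accumulate star ratings per device.
counts = defaultdict(int)
rating_sums = defaultdict(int)
for device, stars in reviews:
    counts[device] += 1
    rating_sums[device] += stars

# Rank devices by review volume and take the smallest subset covering 80% of reviews.
ranked = sorted(counts, key=counts.get, reverse=True)
total = len(reviews)
covered, top_devices = 0, []
for device in ranked:
    if covered >= 0.8 * total:
        break
    top_devices.append(device)
    covered += counts[device]

# Flag devices in that subset whose mean rating is below the overall mean.
overall_mean = sum(rating_sums.values()) / total
for device in top_devices:
    mean = rating_sums[device] / counts[device]
    flag = "  <- prioritize QA" if mean < overall_mean else ""
    print(f"{device}: {counts[device]} reviews, mean {mean:.1f} stars{flag}")
```

In practice, the same aggregation could be run over the review data of similar apps to seed the device list for a new app that has no reviews of its own yet.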
In the mobile app ecosystem, end user ratings of apps (a measure of end user perception) are extremely important to study, as they are highly correlated with downloads and hence revenues. In this study, we examine the relationship between the app ratings (and associated review-comments) from end users and the static analysis warnings (collected using FindBugs) from 10,000 free-to-download Android apps. In our case study, we find that specific categories of FindBugs warnings, such as the 'Bad Practice', 'Internationalization', and 'Performance' categories, occur significantly more often in low-rated apps. We also find that there exists a correspondence between these three categories of warnings and the complaints in the review-comments of end users. These findings provide evidence that certain categories of warnings from FindBugs have a strong relationship with the rating of an app and hence are closely related to the user experience. Thus, app developers can use static analysis tools such as FindBugs to potentially identify the culprit bugs behind the issues that users complain about, before they release the app.
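A developer who wants to act on this finding could tally their own app's warnings by category. The sketch below assumes the standard FindBugs XML report layout (a `<BugCollection>` root whose `<BugInstance>` elements carry a `category` attribute); the report filename is a hypothetical placeholder.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# FindBugs' internal names for the categories the study links to low-rated apps.
FLAGGED = {"BAD_PRACTICE", "I18N", "PERFORMANCE"}

def count_warnings(report_path):
    """Count FindBugs warnings per category from an XML report
    (assumes the standard <BugCollection>/<BugInstance category="..."> layout)."""
    root = ET.parse(report_path).getroot()
    return Counter(bug.get("category") for bug in root.iter("BugInstance"))

if __name__ == "__main__":
    counts = count_warnings("findbugs-report.xml")  # hypothetical report file
    flagged_total = sum(counts[c] for c in FLAGGED)
    print(f"{flagged_total} of {sum(counts.values())} warnings fall in "
          "categories associated with low-rated apps: "
          + ", ".join(f"{c}={counts[c]}" for c in FLAGGED))
```

A high count in the flagged categories does not by itself predict a low rating, but per the study it marks warnings worth fixing before release, since they correspond to issues users complain about.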