Ganesha Upadhyaya scite author profile

Programmers often consult an online Q&A forum such as Stack Overflow to learn new APIs. This paper presents an empirical study on the prevalence and severity of API misuse on Stack Overflow. To reduce manual assessment effort, we design ExampleCheck, an API usage mining framework that extracts patterns from over 380K Java repositories on GitHub and subsequently reports potential API usage violations in Stack Overflow posts. We analyze 217,818 Stack Overflow posts using ExampleCheck and find that 31% may have potential API usage violations that could produce unexpected behavior such as program crashes and resource leaks. Such API misuse is caused by three main reasons---missing control constructs, missing or incorrect order of API calls, and incorrect guard conditions. Even the posts that are accepted as correct answers or upvoted by other programmers are not necessarily more reliable than other posts in terms of API misuse. This study result calls for a new approach to augment Stack Overflow with alternative API usage details that are not typically shown in curated examples.ABSTRACT Programmers often consult an online Q&A forum such as Stack Overflow to learn new APIs. This paper presents an empirical study on the prevalence and severity of API misuse on Stack Overflow. To reduce manual assessment effort, we design ExampleCheck, an API usage mining framework that extracts patterns from over 380K Java repositories on GitHub and subsequently reports potential API usage violations in Stack Overflow posts. We analyze 217,818 Stack Overflow posts using ExampleCheck and find that 31% may have potential API usage violations that could produce unexpected behavior such as program crashes and resource leaks. Such API misuse is caused by three main reasons-missing control constructs, missing or incorrect order of API calls, and incorrect guard conditions. Even the posts that are accepted as correct answers or upvoted by other programmers are not necessarily more reliable than other posts in terms of API misuse. This study result calls for a new approach to augment Stack Overflow with alternative API usage details that are not typically shown in curated examples. CCS CONCEPTS• General and reference → Empirical studies; • Software and its engineering → Software reliability; Collaboration in software development;KEYWORDS online Q&A forum, API usage pattern, code example assessment ACM Reference Format:

show abstract

On ordering problems in message passing software

Long

Bagherzadeh

Lin

et al. 2016

View full text Add to dashboard Cite

The need for concurrency in modern software is increasingly fulfilled by utilizing the message passing paradigm because of its modularity and scalability. In the message passing paradigm, concurrently running processes communicate by sending and receiving messages. Asynchronous messaging introduces the possibility of message ordering problems: two messages with a specific order in the program text could take effect in the opposite order in the program execution and lead to bugs that are hard to find and debug. We believe that the engineering of message passing software could be easier if more is known about the characteristics of message ordering problems in practice. In this work, we present an analysis to study and quantify the relation between ordering problems and semantics variations of their underlying message passing paradigm in over 30 applications. Some of our findings are as follows: (1) semantic variations of the message passing paradigm can cause ordering problems exhibited by applications in different programming patterns to vary greatly; (2) some semantic features such as in-order messaging are critical for reducing ordering problems; (3) modular enforcement of aliasing in terms of data isolation allows small test configurations to trigger the majority of ordering problems. Disciplines Computer Sciences | Software Engineering Comments

show abstract

An Automatic Actors to Threads Mapping Technique for JVM-Based Actor Frameworks

Upadhyaya

Rajan

2014

View full text Add to dashboard Cite

Actor frameworks running on Java Virtual Machine (JVM) platform face two main challenges in utilizing multi-core architectures, i) efficiently mapping actors to JVM threads, and ii) scheduling JVM threads on multi-core. JVM-based actor frameworks allow fine tuning of actors to threads mapping, however scheduling of threads on multi-core is left to the OS scheduler. Hence, efficiently mapping actors to threads is critical for achieving good performance and scalability. In the existing JVM-based actor frameworks, programmers select default actors to threads mappings and iteratively fine tune the mappings until the desired performance is achieved. This process is tedious and time consuming when building large scale distributed applications. We propose a technique that automatically maps actors to JVM threads. Our technique is based on a set of heuristics with the goal of balancing actors computations across JVM threads and reducing communication overheads. We explain our technique in the context of the Panini programming language, which provides capsules as an actor-like abstraction for concurrency, but also explore its applicability to other approaches.

show abstract

Candoia: A Platform for Building and Sharing Mining Software Repositories Tools as Apps

Tiwari

Upadhyaya

Nguyen

et al. 2017

View full text Add to dashboard Cite

We propose Candoia, a novel platform and ecosystem for building and sharing Mining Software Repositories (MSR) tools. Using Candoia, MSR tools are built as apps and Candoia ecosystem, acting as an appstore, allows effective sharing. Candoia platform provides, data extraction tools for curating custom datasets for user projects, and data abstractions for enabling uniform access to MSR artifacts from disparate sources, which makes apps portable and adoptable across diverse software project settings of MSR researchers and practitioners. The structured design of a Candoia app and the languages selected for building various components of a Candoia app promotes easy customization. To evaluate Candoia we have built over two dozen MSR apps for analyzing bugs, software evolution, project management aspects, and source code and programming practices showing the applicability of the platform for building a variety of MSR apps. For testing portability of apps across diverse project settings, we tested the apps using ten popular project repositories, such as Apache Tomcat, JUnit, Node.js, etc, and found that apps required no changes to be portable. We performed a user study to test customizability and we found that five of eight Candoia users found it very easy to customize an existing app. Candoia is available for download. Disciplines Computer Sciences | Software Engineering Comments This article is published as Tiwari, Nitin M., Ganesha Upadhyaya, Hoan Anh Nguyen, and Hridesh Rajan. "Candoia: a platform for building and sharing mining software repositories tools as apps." In Proceedings of the 14th International Conference on Mining Software Repositories, pp. 53-63. IEEE Press, 2017. Posted with permission. Rights

show abstract

On Accelerating Source Code Analysis at Massive Scale

Upadhyaya

Rajan

2018

IIEEE Trans. Software Eng.

View full text Add to dashboard Cite

Encouraged by the success of data-driven software engineering (SE) techniques that have found numerous applications e.g. in defect prediction, specification inference, etc, the demand for mining and analyzing source code repositories at scale has significantly increased. However, analyzing source code at scale remains expensive to the extent that data-driven solutions to certain SE problems are beyond our reach today. Extant techniques have focussed on leveraging distributed computing to solve this problem, but with a concomitant increase in computational resource needs. This work proposes a technique that reduces the amount of computation performed by the ultra-large-scale source code mining task. Our key idea is to analyze the mining task to identify and remove the irrelevant portions of the source code, prior to running the mining task. We show a realization of our insight for mining and analyzing massive collections of control flow graphs of source codes. Our evaluation using 16 classical control-/data-flow analyses that are typical components of mining tasks and 7 Million CFGs shows that our technique can achieve on average 40% reduction in the task computation time. Our case studies demonstrates the applicability of our technique to massive scale source code mining tasks.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.