Software ecosystems have had a tremendous impact on computing and society, capturing the attention of businesses, researchers, and policy makers alike. Massive ecosystems like the JavaScript node package manager (npm) are evidence of how readily packages are available for use by software projects. Due to their high dimensionality and complex properties, the analysis of software ecosystems has been limited. In this paper, we leverage topological methods to visualize the high-dimensional datasets from a software ecosystem. Topological Data Analysis (TDA) is an emerging technique for analyzing high-dimensional datasets that enables us to study the shape of data. We generate the npm software ecosystem topology to uncover insights and extract patterns of existing libraries by studying its localities. Our real-world example reveals many interesting insights and patterns that describe the shape of a software ecosystem.
With over 28 million developers, the success of the GitHub collaborative platform is highlighted by an abundance of communication channels among contemporary software projects. According to the SECI model, knowledge takes two forms, and its sharing (through communication channels) can be described as externalization or combination. Such platforms have revolutionized the way developers work, introducing new channels to share knowledge in the form of pull requests, issues, and wikis. It is unclear how these channels capture and share knowledge. In this research, our goal is to analyze these communication channels in GitHub. First, using the SECI model, we map how knowledge is shared through the communication channels. Then, in a large-scale topology analysis of seven library package projects (i.e., involving over 70 thousand projects), we extract insights into the different communication channels within GitHub. Using two research questions, we explore the evolution of the channels and the adoption of channels by both popular and unpopular library package projects. Results show that (i) contemporary GitHub projects tend to adopt multiple communication channels, (ii) communication channels change over time, and (iii) communication channels are used both to capture new knowledge (i.e., externalization) and to update existing knowledge (i.e., combination).
Recently, Aniche et al. [12] confirmed that news channels also play an important role in shaping and sharing knowledge among developers.
In this paper, we investigate communication channels to understand how projects share knowledge at the software ecosystem level. Inspired by the knowledge-based theory of the firm [13], we aim to validate the underlying theory behind the transferability of knowledge within these library ecosystems, and to investigate how ecosystems influence social practices within and outside their ecosystems.
To achieve our goal of analyzing how communication channels share knowledge across projects, we first identify the different knowledge forms of channels in over 210 thousand library projects from seven different library ecosystems. We then explore the evolution of these channels and distinguish the differences between these seven ecosystems. Similar to a study by Lertwittayatrai et al. [14], we use topological data analysis to generate topologies that cover three years (i.e., 2015 to 2017). Using topological data analysis, the results of the study show that (i) contemporary GitHub projects tend to adopt multiple communication channels, (ii) communication channels change over time, and (iii) communication channels are used both to capture new knowledge (i.e., externalization) and to update existing knowledge.
The contributions of the study are two-fold. First, we present a manual categorization of channel forms in software projects. Second, we present a large-scale topological analysis of channels in software library projects across seven different software ecosystems.
The rest of the pa...