Software evolution studies have traditionally focused on individual products. In this study we scale up the idea of software evolution by considering software compilations composed of a large quantity of independently developed products, engineered to work together. With the success of libre (free, open source) software, these compilations have become common in the form of 'software distributions', which group hundreds or thousands of software applications and libraries into an integrated system. We have performed an exploratory case study on one of them, Debian GNU/Linux, finding some significant results. First, Debian has been doubling in size every 2 years, totalling about 300 million lines of code as of 2007. Second, the mean size of packages has remained stable over time. Third, the number of dependencies between packages has been growing quickly. Finally, while C is still by far the most commonly used programming language for applications, use of the C++, Java, and Python languages have all significantly increased. The study helps not only to understand the evolution of Debian, but also yields insights into the evolution of mature libre software systems in general.
Libre (free / open source) software is a good example of global software development. Thousands of projects, in a wide range of domains which involve hundreds of thousands of developers and contributors from all around the world. Some large (both in size and developer community) libre software projects have shown evidence of producing code with complete functionality and fast evolution (with linear or superlinear growth), while maintaining low defect density. Many companies are exploring how to benefit from this situation, considering several approaches related to libre software development. For instance, some of them have hired full-time developers, focusing their work on some libre software projects they consider strategic.However, before joining the core development team of the project, these hired developers have to follow a process of software comprehension, and get used to the rules and communication mechanisms used in the project. We were interested in the differences between this case and that of volunteer developers working in the same project, Therefore, we studied the duration and basic characteristics of this joining process for the developers of GNOME (a well known, large, libre software project). In our analysis, we have found two groups with clearly different joining patterns. Moreover, we have related those patterns to the different behaviors of volunteer and hired developers: volunteers tend to follow a step-by-step joining process, while hired developers usually experience a "sudden" integration. Some reasons for this different behavior are also discussed.
During the latest years libre (free, open source) software has gained a lot of attention from the industry. Following this interest, the research community is also studying it. For instance, many teams are performing quantitative analysis on the large quantity of data which is publicly available from the development repositories maintained by libre software projects. However, not much of this research is focused on cost or effort estimations, despite its importance (for instance, for companies developing libre software or collaborating with libre software projects), and the availability of some data which could be useful for this purpose. Our position is that classical effort estimation models can be improved from the study of these data, at least when applied to libre software. In this paper, we focus on the characterization of developer activity, which we argue can improve effort estimation. This activity can be traced with a lot of detail, and the resulting data can also be used for validation of any effort estimation model.
With the success of libre (free, open source) software, a new type of software compilation has become increasingly common. Such compilations, often referred to as 'distributions', group hundreds, if not thousands, of software applications and libraries written by independent parties into an integrated system. Software compilations raise a number of questions that have not been targeted so far by software evolution, which usually focuses on the evolution of single applications. Undoubtedly, the challenges that software compilations face differ from those found in single software applications. Nevertheless, it can be assumed that both, the evolution of applications and that of software compilations, have similarities and dependencies.In this sense, we identify a dichotomy, common to that in economics, of software evolution in the small (microevolution) and in the large (macro-evolution). The goal of this paper is to study the evolution of a large software compilation, mining the publicly available repository of a well-known Linux distribution, Debian. We will therefore investigate changes related to hundreds of millions of lines of code over seven years. The aspects that will be covered in this paper are size (in terms of number of packages and of number of lines of code), use of programming languages, maintenance of packages and file sizes.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.