Feature code is often scattered across a software system. Scattering is not necessarily bad if used with care, as witnessed by systems with highly scattered features that evolved successfully. Feature scattering, often realized with a pre-processor, circumvents limitations of programming languages and software architectures. Unfortunately, little is known about the principles governing scattering in large and long-living software systems. We present a longitudinal study of feature scattering in the Linux kernel, complemented by a survey of 74 Linux kernel developers and interviews with nine. We analyzed almost eight years of the kernel's history, focusing on its largest subsystem: device drivers. We learned that the ratio of scattered features remained nearly constant and that most features were introduced without scattering. Yet, scattering easily crosses subsystem boundaries, and highly scattered outliers exist. Scattering often addresses a performance-maintenance tradeoff (alleviating complicated APIs), works around hardware design limitations, and avoids code duplication. While developers do not consciously enforce scattering limits, they actually improve the system design and refactor code, thereby mitigating pre-processor idiosyncrasies or reducing the pre-processor's use.
The notion of features is commonly used to describe, structure, and communicate the functionalities of a system. Unfortunately, features and their locations in software artifacts are rarely made explicit and often need to be recovered by developers. To this end, researchers have conceived automated feature-location techniques. However, their accuracy is generally low, and they mostly rely on only a few information sources, disregarding the richness of modern projects. To improve such techniques, we need to improve the empirical understanding of features and their characteristics, including the information sources that support feature location. Even though the product-line community has extensively studied features, its focus was primarily on variable features in preprocessor-based systems, largely side-stepping mandatory features, which are hard to identify. We present an exploratory case study on identifying and locating features. We study what information sources reveal features and to what extent, compare the characteristics of mandatory and optional features, and formulate hypotheses about our observations. Among other findings, we observe that locating features in code requires substantial domain knowledge for half of the mandatory features (e.g., to connect keywords) and that mandatory and optional features do in fact differ. For instance, mandatory features are less scattered. Other researchers can use our manually created data set of feature locations for future research, guided by our formulated hypotheses.