The notion of features is commonly used to describe, structure, and communicate the functionalities of a system. Unfortunately, features and their locations in software artifacts are rarely made explicit and often need to be recovered by developers. To this end, researchers have conceived automated feature-location techniques. However, their accuracy is generally low, and they mostly rely on few information sources, disregarding the richness of modern projects. To improve such techniques, we need to improve the empirical understanding of features and their characteristics, including the information sources that support feature location. Even though, the product-line community has extensively studied features, the focus was primarily on variable features in preprocessor-based systems, largely side-stepping mandatory features, which are hard to identify. We present an exploratory case study on identifying and locating features. We study what information sources reveal features and to what extent, compare the characteristics of mandatory and optional features, and formulate hypotheses about our observations. Among others, we find that locating features in code requires substantial domain knowledge for half of the mandatory features (e.g., to connect keywords) and that mandatory and optional features in fact differ. For instance, mandatory features are less scattered. Other researchers can use our manually created data set of features locations for future research, guided by our formulated hypotheses.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.