Software defects are a major nuisance in software development and can lead to considerable financial losses or reputation damage for companies. To address this problem, a large number of techniques for predicting software defects, largely based on machine-learning methods, have been developed over the past decades. These techniques usually rely on code-structure and process metrics to predict defects at the granularity of typical software assets, such as subsystems, components, and files. In this paper, we systematically investigate feature-oriented defect prediction: predicting defects at the granularity of features, i.e., domain entities that abstractly represent software functionality and often cross-cut software assets. Feature-oriented prediction can be beneficial, since: (i) particular features might be more error-prone than others, (ii) characteristics of features known to be defective might be useful for predicting other error-prone features, and (iii) feature-specific code might be especially prone to faults arising from feature interactions. We explore the feasibility and solution space of feature-oriented defect prediction by designing and investigating prediction scenarios, metrics, and classifiers. Our study relies on 12 software projects, from which we analyzed 13,685 bug-introducing and corrective commits and systematically generated 62,868 training and test datasets to evaluate the designed classifiers, metrics, and scenarios. Depending on the scenario, the datasets were generated from the 13,685 commits, 81 releases, and 24,532 permutations of our 12 projects. The scenarios we covered include just-in-time (JIT) and cross-project defect prediction. Our results confirm the feasibility of feature-oriented defect prediction. We obtained the best performance (i.e., precision and robustness) with the Random Forest classifier using process and structure metrics. Surprisingly, we found high performance for single-project JIT (median AUROC ≥ 95%) and release-level (median AUROC ≥ 90%) defect prediction, contrary to studies that report poor performance due to insufficient training data. Lastly, we found that a model trained on release-level data from one of the twelve projects could predict the defect-proneness of features in the other eleven projects with a median performance of 82%, without retraining on the target projects. Our results suggest potential for reusing defect-prediction models across projects, as well as for more reliable defect predictions for developers as they modify or release software features.
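To make the evaluation setup concrete, the following minimal sketch (not the authors' tooling) shows how a Random Forest classifier can be trained on tabular per-feature process and structure metrics and scored with AUROC, as in the study; the metric columns and data here are synthetic and purely illustrative.

```python
# Illustrative sketch: train a Random Forest on hypothetical per-feature
# metrics and evaluate it with AUROC. The data below is synthetic; real
# inputs would be process/structure metrics computed per feature.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Hypothetical per-feature metrics, e.g., feature code size (structure)
# and number of commits touching the feature (process).
n_rows = 500
X = rng.normal(size=(n_rows, 4))  # four made-up metric columns
y = (X[:, 0] + 0.5 * X[:, 2]
     + rng.normal(scale=0.5, size=n_rows) > 0).astype(int)  # 1 = defect-prone

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# AUROC on held-out data, the performance measure reported in the study.
scores = clf.predict_proba(X_test)[:, 1]
print("AUROC:", roc_auc_score(y_test, scores))
```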