Evaluating and predicting software maintenance effort using source code metrics is one of the holy grails of software engineering. Unfortunately, previous research has provided contradictory evidence in this regard. The debate is still open: as a community we are not certain about the relationship between code metrics and maintenance impact. In this study we investigate whether source code metrics can indeed establish maintenance effort at the previously unexplored method level granularity. We consider ∼730K Java methods originating from 47 popular open source projects. After considering seven popular method level code metrics and using change proneness as a maintenance effort indicator, we demonstrate why past studies contradict one another while examining the same data. We also show that evaluation context is king. Therefore, future research should step away from trying to devise generic maintenance models and should develop models that account for the maintenance indicator being used and the size of the methods being analyzed. Ultimately, we show that future source code metrics can be applied reliably and that these metrics can provide insight into maintenance effort when they are applied in a judiciously context-sensitive manner.