The erodibility of bedrock and rock masses is an important parameter for understanding landform development, landscape evolution modelling and engineering applications. Yet, complex geotechnical properties and the difficulty of directly quantifying erodibility limit the theoretical understanding and prediction of erosion processes. Several proxy methods have been suggested to assess bedrock erodibility by fluvial impact erosion. However, none of these proxy methods has been rigorously benchmarked against direct laboratory or field measurements. Here, we assess the usefulness of proxy methods described in the literature for the quantitative prediction of fluvial impact erosion. We compare four proxy methods – Mohs' hardness, the Schmidt hammer rebound value, Annandale's erodibility index and the Selby score – to laboratory erodibility data measured using erosion mills. We assess these methods using three statistical parameters: Kendall's tau and Spearman's rho rank correlation coefficients, and the adjusted R² from an exponential fit. We distinguish between three applications, which require increasing correlation strength. These are (i) trend detection (sorting groups of data by their relative erodibility), (ii) quantitative ranking (relative erodibility of groups of data can be quantitatively assessed), and (iii) quantitative prediction (erodibility for individual sites can be quantitatively assessed). Mohs' hardness, Schmidt hammer measurements and Annandale's method are suitable for trend detection, while Selby's method is not. None of the methods is suitable for quantitative prediction. As such, none of the methods is a suitable proxy for estimating erodibility in fluvial bedrock erosion at a particular location. For quantitative ranking, we suggest using either Mohs' hardness or Schmidt hammer measurements, because of (i) their correlation with mill-measured erodibility, (ii) their ease and quickness of application in the field and (iii) the minimal training they require.
When applying these methods, investigators should obtain data both from the same and from different lithological units at many sites. The results can then be used for bulk assessment, but not for individual sites.
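
To make the three statistical parameters concrete, the following minimal Python sketch computes Kendall's tau, Spearman's rho and an adjusted R² for an exponential fit between a proxy value and a measured erodibility. It is an illustration under stated assumptions, not the procedure used in this study: the data values are hypothetical, the exponential fit is assumed to be obtained by linear least squares on the logarithm of erodibility (so the adjusted R² refers to log space), and Kendall's tau is computed as tau-a without tie correction. All function names are chosen here for illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def kendall_tau(x, y):
    """Kendall's tau-a (assumption: no tie correction is applied)."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return score / (n * (n - 1) / 2)


def adjusted_r2_exponential(x, y):
    """Adjusted R^2 of y = a * exp(b * x), fitted on ln(y) (assumption)."""
    log_y = [math.log(v) for v in y]  # requires strictly positive y
    n = len(x)
    mx, mly = sum(x) / n, sum(log_y) / n
    b = (sum((a - mx) * (c - mly) for a, c in zip(x, log_y))
         / sum((a - mx) ** 2 for a in x))
    ln_a = mly - b * mx
    ss_res = sum((c - (ln_a + b * a)) ** 2 for a, c in zip(x, log_y))
    ss_tot = sum((c - mly) ** 2 for c in log_y)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - 2)  # one predictor


# Hypothetical data: proxy value (e.g. a hardness score) vs. erodibility.
proxy = [2, 3, 4, 5, 6, 7]
erodibility = [8.0, 5.0, 2.5, 3.0, 0.9, 0.4]

print("Kendall tau:", kendall_tau(proxy, erodibility))
print("Spearman rho:", spearman_rho(proxy, erodibility))
print("Adjusted R2 (log space):", adjusted_r2_exponential(proxy, erodibility))
```

The two rank coefficients depend only on the ordering of the data, which matches trend detection, whereas the adjusted R² measures how well a specific functional form reproduces the values, which is the stricter requirement behind quantitative prediction.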