The extent to which phylogenetic diversity (PD) captures feature diversity (FD) is a topical and controversial question in biodiversity conservation. In this short paper, we formalise this question and establish a precise mathematical condition for FD (based on discrete characters) to coincide with PD. In this way, we make explicit the two main reasons why the two diversity measures might disagree for given data; namely, the presence of certain patterns of feature evolution and loss, and using temporal branch lengths for PD in settings that may not be appropriate (e.g. due to rapid evolution of certain features over short periods of time). Our paper also explores the relationship between the 'Fair Proportion' index of PD and a simple index of FD (both of which correspond to Shapley values in cooperative game theory). In a second mathematical result, we show that the two indices can take identical values for any phylogenetic tree, provided the branch lengths in the tree are chosen appropriately.

Almost 30 years ago, Dan Faith published a seminal paper that laid out how phylogenies might aid in identifying sets of species with maximal "feature diversity" (Faith, 1992). Faith's stated goal was to support practical biodiversity conservation in the face of limited resources, coupled with the assumption that maximising feature diversity (the total number of unique character states represented by a set of taxa) was a desirable conservation target.

Drawing on the call of Vane-Wright et al. (1991) to consider taxonomic distinctiveness when prioritizing species, Faith introduced the phylogenetic diversity (PD) metric, simply the sum of the edge lengths of the minimal subtree linking a subset of species to the root of the encompassing phylogeny (also called the 'minimum spanning path' by Faith (1992)).
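As an illustration of this definition, the following minimal sketch computes PD for a subset of species on a small rooted tree. The tree, its edge lengths, and the names `parent`, `length` and `phylogenetic_diversity` are hypothetical choices made for this example only; they are not taken from Faith (1992).

```python
def phylogenetic_diversity(parent, length, species):
    """Sum the lengths of the edges in the minimal subtree that links
    the given species to the root.

    parent[v] is the parent of node v (the root has no entry);
    length[v] is the length of the edge joining v to parent[v].
    """
    counted = set()  # nodes whose incoming edge has already been added
    total = 0.0
    for leaf in species:
        node = leaf
        # Walk towards the root, stopping once we reach an edge that an
        # earlier species has already contributed.
        while node in parent and node not in counted:
            counted.add(node)
            total += length[node]
            node = parent[node]
    return total


# Hypothetical toy tree: root -> x -> {a, b}, and root -> c.
parent = {"x": "root", "a": "x", "b": "x", "c": "root"}
length = {"x": 1.0, "a": 2.0, "b": 3.0, "c": 4.0}

pd_ab = phylogenetic_diversity(parent, length, ["a", "b"])  # edges a, b, x
pd_ac = phylogenetic_diversity(parent, length, ["a", "c"])  # edges a, x, c
```

In this toy tree the edge above `x` is shared by `a` and `b`, so it is counted only once when both species are in the subset.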
Importantly, these edge lengths were given in units of reconstructed character changes under maximum parsimony on the cladogram representing a character state matrix with no homoplasy. Faith showed, with an example, that the sum of these reconstructed edge lengths would lead to the same total feature diversity as that calculated from the character matrix itself. Importantly, if these cladistic edge lengths are representative of all features, then maximising PD (e.g. over a given subset size) would maximise feature diversity, even in the face of some homoplasy. The bulk of Faith's 1992 paper was devoted to introducing the machinery to maximise PD.

Efficient algorithms for finding maximum PD sets are available (Bordewich et al., 2008), the metric has been extended to networks (Minh et al., 2009), and there are countless case studies that both measure and optimize PD for conservation (see, e.g., Pollock et al., 2017); Faith's original paper has been cited in excess of 2000 times. A recent review (Tucker et al., 2019) considered the literature concerning both the empirical correlations between PD and feature diversity, a...