“…In contrast to orphan open reading frame (ORFan) proteins, which are specific to particular species or strains and are subject to rapid loss (64,175,265), many proteins of unknown (or even known) function are distinctive characteristics of species from monophyletic clades of different phylogenetic depths (74,98,100,118,130,274). The presence of these proteins in a conserved state in all or most species and strains of a given clade, but nowhere else, suggests that the gene for each protein first evolved in a common ancestor of that clade and was then retained by its descendants (74,80,98,100,210). Thus, these proteins represent conserved signature proteins (CSPs) that are distinctive characteristics of particular lineages, and they provide useful molecular markers for defining those groups or distinguishing them from other bacteria (118).…”
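
The criterion described in the passage (present in all or most members of a clade, absent everywhere else) can be illustrated with a minimal sketch. It assumes homolog presence has already been determined by some upstream search and is supplied as a simple mapping. The function name `find_clade_specific_proteins`, the `min_fraction` threshold, and the toy genome and protein labels are all hypothetical and do not come from the cited work.

```python
from typing import Dict, List, Set


def find_clade_specific_proteins(
    presence: Dict[str, Set[str]],
    clade: Set[str],
    min_fraction: float = 0.9,  # assumed cutoff for "all or most" members
) -> List[str]:
    """Return proteins found in at least `min_fraction` of the clade's
    genomes and in no genome outside the clade."""
    if not clade:
        return []
    hits = []
    for protein, genomes in presence.items():
        inside = genomes & clade    # clade members carrying a homolog
        outside = genomes - clade   # any homolog outside the clade
        if not outside and len(inside) / len(clade) >= min_fraction:
            hits.append(protein)
    return sorted(hits)


# Toy data (hypothetical labels): four clade genomes and one outgroup.
clade = {"sp1", "sp2", "sp3", "sp4"}
presence = {
    "protA": {"sp1", "sp2", "sp3", "sp4"},          # all members, nowhere else
    "protB": {"sp1"},                               # one genome only, ORFan-like
    "protC": {"sp1", "sp2", "sp3", "sp4", "out1"},  # also found outside the clade
    "protD": {"sp1", "sp2", "sp3"},                 # most members, nowhere else
}

candidates = find_clade_specific_proteins(presence, clade, min_fraction=0.75)
print(candidates)
```

In this toy case only `protA` and `protD` pass. `protB` fails the "all or most" requirement, and `protC` fails because it also occurs outside the clade. A real analysis would additionally need a defensible homology search and a phylogeny establishing that the clade is monophyletic, neither of which this sketch attempts.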