Host genomes have acquired diversity from viruses through the capture of viral elements, often from endogenous retroviruses (ERVs). These viral elements contribute new transcriptional control elements and new protein-encoding genes, and their refinement through evolution can generate novel physiological functions for the host. EnvP(b)1 is an endogenous retroviral envelope gene found in human and other primate genomes. We show that EnvP(b)1 arose very early in the evolution of primates, at least 40-47 million years ago, but has nevertheless retained its ability to fuse primate cells. We have detected similar sequences in the genome of a lemur species, suggesting that a progenitor virus may have circulated 55+ million years ago. We demonstrate that EnvP(b)1 protein is expressed in multiple healthy human tissues and is fully processed, rendering it competent to fuse cells. The gene encoding this activated fusogen is under purifying selection, suggesting that its expression is selectively advantageous. We determined a structure of the inferred receptor binding domain of human EnvP(b)1, revealing close structural similarities between this Env protein and those of currently circulating leukemia viruses, despite poor sequence conservation. This observation highlights a common scaffold from which novel receptor binding specificities have evolved. The evolutionary plasticity of this domain may underlie the diversity of related Envs in circulating viruses and co-opted elements alike. The function of EnvP(b)1 in primates remains unknown.
SIGNIFICANCE STATEMENT
Organisms can access genetic and functional novelty by capturing viral elements within their genomes, where they can evolve to drive new cellular or organismal processes. We demonstrate that a retrovirus envelope gene, EnvP(b)1, has been maintained as a functional protein for 40 to ≥55 million years and is expressed as a protein in multiple healthy human tissues. We believe it serves an as-yet-unknown function in primates. We determined the structure of its inferred receptor binding domain and compared it with the same domain in modern viruses. We find a common conserved architecture that underlies the varied receptor binding activity of divergent Env genes. The modularity and versatility of this domain may underpin the evolutionary success of this clade of fusogens.