Many viruses use overprinting (alternate reading frame utilization) as a means to increase protein diversity in genomes severely constrained by size. However, the evolutionary steps that facilitate the de novo generation of a novel protein within an ancestral ORF have remained poorly characterized. Here, we describe the identification of an overprinting gene, expressed from an Alternate frame of the Large T Open reading frame (ALTO) in the early region of Merkel cell polyomavirus (MCPyV), the causative agent of most Merkel cell carcinomas. ALTO is expressed during, but not required for, replication of the MCPyV genome. Phylogenetic analysis reveals that ALTO is evolutionarily related to the middle T antigen of murine polyomavirus despite almost no sequence similarity. ALTO/MT arose de novo by overprinting of the second exon of T antigen in the common ancestor of a large clade of mammalian polyomaviruses. Taking advantage of the low evolutionary divergence and diverse sampling of polyomaviruses, we propose evolutionary transitions that likely gave birth to this protein. We suggest that two highly constrained regions of the large T antigen ORF provided a start codon and C-terminal hydrophobic motif necessary for cellular localization of ALTO. These two key features, together with stochastic erasure of intervening stop codons, resulted in a unique protein-coding capacity that has been preserved ever since its birth. Our study not only reveals a previously undefined protein encoded by several polyomaviruses including MCPyV, but also provides insight into de novo protein evolution.

gene evolution | synonymous substitution | disordered motifs

The birth of new genes has fascinated biologists for decades. Although the steps required to generate a new gene by gene duplication or gene rearrangement have been characterized, less is known about the birth of new genes de novo.
One particularly intriguing mechanism of de novo gene birth is via "overprinting," in which a novel overprinting gene is encoded as an alternate ORF within an ancestral "overprinted" gene (1). Overprinting results in two unrelated functional proteins encoded as overlapping ORFs within the same DNA sequence. However, the origins of such a complex evolutionary solution have remained elusive.

Viruses appear to be especially adept at this form of evolutionary innovation. This frequent use of overprinting is likely the result of the severe constraints imposed on viral genome size, making gene innovation more likely to occur as overprinting rather than within a noncoding region (2). Due to the numerous examples of overprinting in single-stranded RNA viruses, a great deal of research has focused in particular on this class of viruses (3-6). However, small DNA viruses, such as adenoviruses, papillomaviruses, and polyomaviruses, have a similar requirement to maximize the coding capacity of their genomes. In this study, we have taken advantage of our identification of an overprinting gene born in the ancestor of a large clade of polyomaviruses to investigate the...