Background
Diversity-generating retroelements (DGRs) provide organisms with a unique means for adaptation to a dynamic environment through massive protein sequence variation. The potential scope of this variation exceeds that of the vertebrate adaptive immune system. DGRs were known to exist only in viruses and bacteria until their recent discovery in archaea belonging to the ‘microbial dark matter’, specifically in organisms closely related to Nanoarchaeota. However, Nanoarchaeota DGR variable proteins were unassignable to known protein folds and apparently unrelated to characterized DGR variable proteins.

Results
To address how Nanoarchaeota DGR variable proteins accommodate massive sequence variation, we determined the 2.52 Å resolution limit crystal structure of one such protein, AvpA. The structure revealed a C-type lectin (CLec)-fold that organizes a putative ligand-binding site capable of accommodating 10¹³ sequences. This fold is surprisingly reminiscent of the CLec-folds of viral and bacterial DGR variable proteins, but differs sufficiently to define a new CLec-fold subclass, which is consistent with early divergence between bacterial and archaeal DGRs. The structure also enabled identification of a group of AvpA-like proteins in multiple putative DGRs from uncultivated archaea. These variable proteins may aid Nanoarchaeota and these uncultivated archaea in symbiotic relationships.

Conclusions
Our results have uncovered the widespread conservation of the CLec-fold in viruses, bacteria, and archaea for accommodating massive sequence variation. In addition, to our knowledge, this is the first report of an archaeal CLec-fold protein.