“…The underlying assumption of generative protein models is that natural proteins are under evolutionary pressure to be functional, so novel sequences drawn from the same distribution will also be functional [17]. Many generative protein models have been proposed, including deep neural network methods such as generative adversarial networks (GANs) [15], variational auto-encoders (VAEs) [16,18], language models [13,14,19–22], and other neural networks [23,24], as well as statistical methods such as ancestral sequence reconstruction (ASR) [25,26] and direct coupling analysis (DCA) [27–29]. However, comparing the performance of these methods, i.e.…”