Sumanth Kumar Mutte: Sumanth.mutte@wur.nl
* Corresponding author (Dolf Weijers): dolf.weijers@wur.nl

ABSTRACT

Protein oligomerization is a fundamental process for building complex functional modules. Domains that facilitate oligomerization are diverse and widespread in nature across all kingdoms of life. One such domain is the Phox and Bem1 (PB1) domain, which is functionally relatively well understood in the animal kingdom. However, beyond animals, neither the origin nor the evolutionary patterns of PB1-containing proteins are understood. While PB1 domain proteins have been found in other kingdoms, including plants, it is unclear how these relate to animal PB1 proteins.

To address this question, we utilized large transcriptome datasets along with the proteomes of a broad range of species. We discovered eight PB1 domain-containing protein families in plants, along with three each in Protozoa and Chromista and four families in Fungi. Studying the deep evolutionary history of PB1 domains throughout eukaryotes revealed the presence of at least two, but likely three, ancestral PB1 copies in the Last Eukaryotic Common Ancestor (LECA). These three ancestral copies gave rise to multiple orthologues later in evolution. Tertiary structural models of these plant PB1 families, combined with Random Forest-based classification, indicated family-specific differences attributed to the length of the PB1 domain and the proportion of β-sheets.

This study identifies novel PB1 families and reveals considerable complexity in the protein oligomerization potential at the origin of eukaryotes. The newly identified relationships provide an evolutionary basis to understand the diverse functional interactions of key regulatory proteins carrying PB1 domains across eukaryotic life.

KEYWORDS

Phylogeny, protein oligomerization, plants, animals, homology modeling, random forest

BACKGROUND

Protein-protein interaction is a basic and important mechanism that brings proteins together in a functional module and thus allows the development of higher-order functionalities. One of the versatile interaction domains that brings this modularity through either dimerization or oligomerization is the PB1 domain. Initially, two animal proteins, p40Phox and p67Phox, were shown to interact through a novel motif that contains a stretch of negatively charged amino acids [1]. In the same study, it was also shown that the yeast CELL DIVISION CONTROL 24 (Cdc24) protein contains the same motif as found in p40Phox, and the motif was hence named the PC motif (for p40Phox and Cdc24; Nakamura et al. 1998). Later, the BUD EMERGENCE 1 (Bem1) protein in yeast was also found to have this motif, after which it was renamed the PB1 domain (for Phox and Bem1). The PB1 domain of Bem1 in yeast is required for the interaction with Cdc24 to maintain cell polarity [2]. Later, in mammals, many protein families were identified that contain a ...