bioRxiv preprint first posted online Jan. 20, 2017; doi: http://dx.doi.org/10.1101/100834. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.

Understanding the evolution of plant immunity is necessary to inform rational approaches (Hall et al., 2009; Joshi et al., 2013). With over 50 fully sequenced plant genomes available today, it is timely to apply comparative genomics approaches to investigate common trends in NLR evolution across the plant kingdom, including key crop species.

In contrast to the highly conserved NB-ARC domains, the Leucine Rich Repeats (LRRs) of NLRs show high variability (Noel et al., 1999; Jacob, Vernaldi and Maekawa, 2013). The functional consequence of high LRR variation is thought to be the generation of novel recognition specificities (Bakker et al., 2006; Sukarta, Slootweg and Goverse, 2016). In addition, recent findings show that novel pathogen recognition specificities can also be acquired through the fusion of non-canonical domains to NLRs (Le Roux et al., 2015; Kroj et al., 2016). These exogenous domains can serve as 'baits' that mimic host targets of pathogen-derived effector molecules and therefore act in concert with LRR variation to broaden the spectrum of recognised pathogen-derived effectors (Cesari, Bernoux et al., 2014a; Cesari et al., 2014b; Le Roux et al., 2015).

NLR plant immune receptors were discovered over 20 years ago through cloning of plant disease resistance genes in Arabidopsis (Mindrinos et al., 1994; Bent et al., 1994).

It is also clear that NLR(-ID) protein duplication has proliferated most strongly in these species for this hotspot clade (Figure 1E). However, the ratio of NLRs with and without extra domains in this clade has remained relatively constant at around 59%, suggesting that the rate of domain recycling has been constant across these species (Figure 1B
Expansion of the NLR-ID hotspot 1 clade is linked to diversification through new gene fusions

The NLR-ID proteins from evolutionary hotspot 1 were examined further to test the hypothesis that the increase in the number of NLRs with integrated domains was due to the creation of novel gene fusions rather than the duplication of existing ID fusions. A clear expansion of the ID repertoire was found for this group of proteins, particularly for the Triticeae species (Table 1). It is possible that differences in the observed repertoires can be explained partly by incomplete annotation of genomes or fragmented assembly of NLRs; such proteins were omitted from the phylogenetic analysis if they were <70% complete across the NB-ARC domain. However, the overall trend across the genomes strongly suggests that differences cannot be explained solely by differences in genome assemblies. Moreover, genomes such as B. distachyon, Z. mays and O. sativa are assembled to much higher quality than those of the Triticeae species, yet they contain fewer NLR-IDs and have lower ID di...