Interspecific hybridization events have played a major role in plant speciation, yet the evolutionary origin of hybrid species often remains enigmatic. Here, we inferred the evolutionary origin of the allotetraploid species Coffea arabica, which is widely cultivated for Arabica coffee production.

We estimated genetic distances between C. arabica and all species that are known to be closely related to C. arabica using genotyping-by-sequencing (GBS) data. In addition, we reconstructed a time-calibrated multilabeled phylogenetic tree of 24 species to infer the age of the C. arabica hybridization event. Ancestral states of self-compatibility were also reconstructed to infer the evolution of self-compatibility in Coffea.
C. canephora and C. eugenioides were confirmed as the putative progenitor species of C. arabica. These species most likely hybridized between 1.08 million and 543 thousand years ago.

We inferred the phylogenetic relationships between C. arabica and its closest relatives and shed new light on the evolution of self-compatibility in Coffea. Furthermore, the age of the hybridization event coincides with periods of environmental upheaval, which may have induced range shifts of the progenitor species that facilitated the emergence of C. arabica.

2006). Using genomic in-situ hybridization (GISH) and restriction fragment length polymorphism (RFLP) markers, C. canephora Pierre ex A.Froehner and C. eugenioides S.Moore have been identified as the closest extant relatives of C. arabica (Lashermes et al., 1999). Although cytogenetic methods such as GISH are considered reliable for studying hybridization (Chester et al., 2010), a certain ambiguity remains regarding the progenitor species of C. arabica. Based on GISH and fluorescence in-situ hybridization (FISH), Raina et al. (1998) suggested C. congensis A.Froehner as a progenitor species of C. arabica instead of C. canephora. Hamon et al. (2009), however, could not discriminate between C. canephora and C. congensis as the putative progenitor of C. arabica using FISH and fluorochrome banding (CMA, DAPI). Moreover, genetic divergence in plastid DNA regions or in the internal transcribed spacer (ITS) sequence was too low to resolve phylogenetic relationships between C. arabica and other Coffea species (