Previous genome-scale studies of populations living today in Ethiopia have found evidence of recent gene flow from a Eurasian source, dating to the last 3,000 years 1,2,3,4. Haplotype-based 1 and genotype-based analyses of modern 2,4 and ancient DNA (aDNA) 3,5 data have considered a Sardinia-like proxy 2, broadly Levantine 1,4 or Neolithic Levantine 3 populations as a range of possible sources for this gene flow. Given the ancient nature of this gene flow and the extent of population movements and replacements that affected West Asia in the last 3,000 years, aDNA evidence would seem to be the best proxy for determining the putative source population. We demonstrate, however, that the deeply divergent, autochthonous African component, which accounts for ∼50% of most contemporary Ethiopian genomes, affects the overall allele frequency spectrum to an extent that makes it hard both to control for it and, at the same time, to discern between subtly different, yet important, Eurasian sources (such as Anatolian or Levant Neolithic ones).
Here we re-assess the pattern of allele sharing between the Eurasian component of Ethiopians (here called "NAF" for Non African) and ancient and modern proxies, after having extracted NAF from Ethiopians through ancestry deconvolution, and unveil a genomic signature compatible with population movements that affected the Mediterranean area and the Levant after the fall of the Minoan civilization.
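The idea of restricting allele sharing analyses to the deconvoluted Eurasian fraction can be illustrated with a minimal sketch. It is not the pipeline used in this study: the per-site probability of non-African ancestry, the `threshold` value, the function names and the toy data are all assumptions made for illustration, and the uncorrected outgroup-f3-style statistic merely stands in for whichever allele sharing statistic is actually applied.

```python
import numpy as np

def mask_to_naf(genotypes, p_non_african, threshold=0.9):
    """Keep alleles only where non-African ancestry is confidently assigned.

    genotypes: (n_haplotypes, n_snps) array of 0/1 alleles.
    p_non_african: same shape; per-site probability of non-African ancestry
        from some deconvolution tool (hypothetical input).
    threshold: hypothetical confidence cut-off, not taken from the text.
    """
    g = np.asarray(genotypes, dtype=float)
    p = np.asarray(p_non_african, dtype=float)
    return np.where(p >= threshold, g, np.nan)

def allele_frequencies(masked):
    """Per-SNP frequency of allele 1 over unmasked haplotypes (NaN if none)."""
    counts = np.sum(~np.isnan(masked), axis=0)
    sums = np.nansum(masked, axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, sums / counts, np.nan)

def outgroup_f3(freq_out, freq_a, freq_b):
    """Uncorrected f3(O; A, B): mean over SNPs of (o - a) * (o - b)."""
    o, a, b = (np.asarray(x, dtype=float) for x in (freq_out, freq_a, freq_b))
    keep = ~(np.isnan(o) | np.isnan(a) | np.isnan(b))
    return float(np.mean((o[keep] - a[keep]) * (o[keep] - b[keep])))

# Toy data (made up): 4 admixed haplotypes x 3 SNPs.
genotypes = [[1, 0, 1], [1, 1, 0], [0, 1, 1], [0, 0, 1]]
p_non_african = [[0.95, 0.95, 0.2],
                 [0.99, 0.10, 0.3],
                 [0.20, 0.95, 0.4],
                 [0.10, 0.97, 0.5]]

naf_freqs = allele_frequencies(mask_to_naf(genotypes, p_non_african))
outgroup = [0.0, 0.0, 0.5]   # hypothetical outgroup frequencies
proxy = [1.0, 0.5, 0.5]      # hypothetical candidate source frequencies
f3_value = outgroup_f3(outgroup, naf_freqs, proxy)
```

SNPs with no confidently assigned haplotype are dropped from the statistic rather than imputed, so the number of usable sites shrinks as the threshold becomes stricter.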
Results and Discussion
To determine the most likely source of the Eurasian gene flow into the ancestral gene pool of present-day Ethiopians we have used a combination of ancestry deconvolution (AD) and allele sharing methods 6. AD refers to analyses that determine the likeliest ancestry composition of genomes of individuals with mixed ancestry at fine haplotype resolution. These methods have allowed us to i) exploit high quality modern data and ii) harness the power of allele sharing tools on genetic fractions with no or reduced African contributions. Such a strategy, while potentially beneficial, introduces a novel source of bias, which we aimed to explore here. In particular, after AD of 120 Ethiopian genomes 7, we assigned each genomic SNP to one of the following four categories based on the method likelihoods (see Methods for further details): 1) confidently non African (NAF); 2) low confidence non African (X); 3) low confidence African (Y) and 4) confidently African (AF, consistently filtered out from our analyses). While basing our inference on the NAF component alone, we here demonstrate that the component X accounts for a minority of the genome and, when analysed together with NAF, does not qualitatively change the results. Furthermore, when joining together the NAF and AF confidently assigned components (to create "Joint" components) we recapitulate the signals of the global population (prior to ancestry deconvolution), showing that the X and Y components do not hold a considerable or peculiar genetic signat...