Sex determination in grapevine evolved through a complex succession of switches in sexual systems. Phased genomes built with single molecule real-time sequencing reads were assembled for eleven accessions of cultivated hermaphrodite grapevines and dioecious males and females, including the ancestor of domesticated grapevine and other related wild species. By comparing the phased sex haplotypes, we defined the sex locus of the Vitis genus and identified polymorphisms spanning regulatory and coding sequences that are in perfect association with each sex-type throughout the genus. These findings identified a novel male-fertility candidate gene, INP1, and significantly refined the model of sex determination in Vitis and its evolution.

Whole-sequence alignments of male (M) (a), female (F) (b), and hermaphrodite (H) (c) sex haplotypes on Vv vinifera Cabernet Sauvignon chromosome 2 Alt1 (H). d, From top to bottom, schematic representations of the sex locus in hermaphrodite Vv vinifera Cabernet Sauvignon (HF), male Vv sylvestris DVIT3351.27 (MF) and female Vv sylvestris DVIT3603.07 (FF), male V. arizonica (MF), and male M. rotundifolia (MF). A triangle along chromosome 2 marks the position of VVIB23, the genetic marker closely linked to the sex locus (Riaz et al., 2006). Genes affected by nonsense mutations are indicated with an "X".
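To make the notion of a polymorphism "in perfect association" with a sex type concrete, the following is a minimal sketch and not the pipeline used in this study. It assumes allele calls are already available per phased haplotype at aligned positions; the function name `sex_associated_sites`, the data layout, and the toy coordinates and haplotype labels are all hypothetical. A site is assigned to a haplotype type (F, M, or H) when every haplotype of that type carries the same allele and no haplotype of another type carries it.

```python
from typing import Dict, List


def sex_associated_sites(
    alleles: Dict[int, Dict[str, str]],
    hap_type: Dict[str, str],
) -> Dict[str, List[int]]:
    """Return, for each haplotype type, the positions perfectly associated with it.

    alleles:  position -> {haplotype name -> allele call} (hypothetical layout)
    hap_type: haplotype name -> type label such as "F", "M" or "H"
    """
    types = sorted(set(hap_type.values()))
    result: Dict[str, List[int]] = {t: [] for t in types}
    for pos in sorted(alleles):
        calls = alleles[pos]
        for t in types:
            # Alleles seen in haplotypes of type t, and in all other haplotypes.
            inside = {calls[h] for h in hap_type if hap_type[h] == t}
            outside = {calls[h] for h in hap_type if hap_type[h] != t}
            # Perfect association: one shared allele, absent from every other type.
            if len(inside) == 1 and not (inside & outside):
                result[t].append(pos)
    return result


# Toy data only: positions and haplotype names are made up for illustration.
toy_types = {"f1": "F", "f2": "F", "m1": "M", "h1": "H"}
toy_alleles = {
    100: {"f1": "A", "f2": "A", "m1": "G", "h1": "G"},  # F-specific allele
    200: {"f1": "C", "f2": "C", "m1": "T", "h1": "C"},  # M-specific allele
    300: {"f1": "A", "f2": "G", "m1": "A", "h1": "G"},  # no association
    400: {"f1": "T", "f2": "T", "m1": "T", "h1": "C"},  # H-specific allele
}
toy_result = sex_associated_sites(toy_alleles, toy_types)
print(toy_result)
```

In this sketch a single discordant haplotype is enough to exclude a site, which is the strict sense of "perfect" association; handling of missing calls and INDELs is deliberately left out.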
Sex-linked polymorphisms affect protein sequences

Comparison of the Vitis sex haplotypes also allowed the identification of SNPs and small INDELs (≤50 bp) perfectly associated with sex. All sex-related polymorphisms were on chromosome 2 of Cabernet Sauvignon from positions 4,801,876 to 5,061,548 (Fig. 3a; Supplementary Table 2), further confirming and delimiting the sex locus (Fechter et al., 2012; Picq et al., 2014). In total, 1,275 SNPs were shared by all F haplotypes versus H and M haplotypes and 539 SNPs were shared by all M haplotypes versus F and H haplotypes. A small number of SNPs (127) were linked to the H haplotype. Interestingly, the highest density of M-associated SNPs was in the first 8 kbp of the sex locus (176 SNPs; positions 4,801,876 to 4,809,592), and the first F-associated SNP was 40 kbp downstream of the start of the locus (4,842,196). Sex-specific SNP distributions were also similar when including M. rotundifolia haplotypes in the comparison, but the number of clear sex-associated SNPs decreased because of the divergence of the two genera (Extended Data Fig. 2). Many of the sex-linked SNPs we identified impact predicted protein sequences (Fig. 3b). In total, 89 nonsynonymous F-specific SNPs were detected in ten genes. These included one in a gene encoding a trehalose-6-phosphate phosphatase (TPP), one in INAPERTURE POLLEN1 (INP1), seven in an exostosin-coding gene, three in a 3-ketoacyl-acyl carrier protein synthase III gene (KASIII), seven in a PLATZ transcription factor-coding gene (PLATZ), eighteen in the first FMO, twenty-six in the second FMO, eleven in the third FMO, eleven in the hypothetical protein (VviFSEX), and four in the adenine phosphoribosyltransferase 3 (APT3) (Supplementary Table 2). Three of t...