Cotton (Gossypium spp.) is an important natural textile fiber and oilseed crop cultivated worldwide. Lint percentage (LP, %) is one of the important yield factors, so increasing lint percentage is a core goal of cotton breeding. However, the genetic and molecular mechanisms that control lint percentage in upland cotton remain largely unknown. Here, we performed a genome-wide association study (GWAS) for LP based on phenotypic data from 254 upland cotton accessions in four environments and their best linear unbiased predictions (BLUPs), using the high-density CottonSNP80K array. A total of 41,413 high-quality single-nucleotide polymorphisms (SNPs) were screened, and 34 SNPs within 22 QTLs were identified as significantly associated with lint percentage in different environments. In total, 175 candidate genes were identified from two major genomic loci (GR1 and GR2) of upland cotton, and 50 hub genes were identified through GO enrichment and weighted gene co-expression network analysis (WGCNA). Furthermore, two candidate/causal genes, Gh_D01G0162 and Gh_D07G0463, which pleiotropically increased lint percentage, were identified, and their function was further verified through LD block, haplotype, and qRT-PCR analyses. Co-expression network analysis showed that the candidate/causal and hub gene Gh_D07G0463 was significantly related to the other candidate gene, Gh_D01G0162, and pyramiding the two genes simultaneously lays the foundation for a more efficient increase in cotton production. Our study provides crucial insights into the genetic and molecular mechanisms underlying variation in yield traits and serves as an important foundation for improving lint percentage via marker-assisted breeding.
Key Message: A total of 34 SNPs within 22 QTLs associated with lint percentage were identified by a GWAS. Two candidate genes underlying this trait were also detected based on the significant SNPs.