Polycystic ovary syndrome (PCOS) is a common metabolic disorder in women. To identify causative genes, we conducted a genome-wide association study (GWAS) of PCOS in Han Chinese. The discovery set included 744 PCOS cases and 895 controls; subsequent replications involved two independent cohorts (2,840 PCOS cases and 5,012 controls from northern Han Chinese; 498 cases and 780 controls from southern and central Han Chinese). We identified strong evidence of associations between PCOS and three loci: 2p16.3 (rs13405728; combined P-value by meta-analysis P(meta) = 7.55 × 10⁻²¹, odds ratio (OR) 0.71); 2p21 (rs13429458, P(meta) = 1.73 × 10⁻²³, OR 0.67); and 9q33.3 (rs2479106, P(meta) = 8.12 × 10⁻¹⁹, OR 1.34). These findings provide new insight into the pathogenesis of PCOS. Follow-up studies of the candidate genes in these regions are recommended.
Haplotype data from diploid organisms provide valuable information on human evolutionary history and play an important role in identifying candidate genes in the etiology of complex genetic diseases. However, haplotypes of diploid individuals cannot be acquired easily. Molecular haplotyping methods are very costly and have low throughput, and current genotyping and sequencing methods do not provide information on the linkage phase in diploid organisms. Applying statistical methods to infer the haplotype phase in samples of diploid sequences is a very cost-effective alternative. Several computational and statistical methods have been developed for haplotype inference, including Clark's algorithm [1], the Expectation Maximization (EM) algorithm [2], and the Gibbs sampler [3]. Because of its interpretability and stability, the EM algorithm has become one of the most widely used statistical algorithms. However, the standard EM algorithm has several weaknesses, including the inability to handle a large number of markers and convergence to a local optimum. To overcome these problems, various derivative methods have been developed, such as the Partition-Ligation EM (PLEM) algorithm to handle many more linked loci [4], the Optimal Step Length EM (OSLEM) algorithm to accelerate the calculations [5], and the Stochastic EM (SEM) algorithm to deal with missing genotypic data and to avoid convergence to local maxima [6]. However, most packages are intended for biallelic single-nucleotide polymorphism (SNP) data.
More and more researchers are analyzing both multiallelic and biallelic markers in linkage and/or association studies of complex diseases. The analysis of linkage disequilibrium (LD) between multiallelic loci and haplotype inference of many loci (including bi- and multiallelic markers) present a number of common problems. The major difficulty for the haplotype inference problem
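To make the standard EM approach to haplotype inference concrete, the following is a minimal sketch for the simplest case only: two biallelic loci, unphased genotypes coded as the count of allele 1 at each locus, and Hardy-Weinberg equilibrium assumed. The function names and the toy sample are hypothetical and chosen for illustration; this is not the PLEM, OSLEM, or SEM variant mentioned above, and it does not handle missing data or multiallelic markers.

```python
from collections import Counter

# The four possible haplotypes at two biallelic loci (allele 0 or 1 at each).
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(genotype):
    """Return the unordered haplotype pairs consistent with an unphased genotype.

    A genotype is a tuple of allele-1 counts per locus, e.g. (1, 1) for a
    double heterozygote, which is the only phase-ambiguous case here.
    """
    pairs = []
    for i, h1 in enumerate(HAPLOTYPES):
        for h2 in HAPLOTYPES[i:]:
            if all(a + b == g for a, b, g in zip(h1, h2, genotype)):
                pairs.append((h1, h2))
    return pairs


def em_haplotype_frequencies(genotypes, n_iter=100, tol=1e-10):
    """Estimate haplotype frequencies by EM (illustrative sketch, assumes HWE)."""
    counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)
    freqs = {h: 1.0 / len(HAPLOTYPES) for h in HAPLOTYPES}  # uniform start

    for _ in range(n_iter):
        expected = {h: 0.0 for h in HAPLOTYPES}
        for genotype, n in counts.items():
            pairs = compatible_pairs(genotype)
            # E-step: probability of each phase resolution under current freqs.
            weights = [
                (1 if h1 == h2 else 2) * freqs[h1] * freqs[h2]
                for h1, h2 in pairs
            ]
            total = sum(weights)
            if total == 0.0:
                continue  # genotype impossible under current estimates
            for (h1, h2), w in zip(pairs, weights):
                share = n * w / total
                expected[h1] += share
                expected[h2] += share
        # M-step: new frequencies are expected haplotype counts per chromosome.
        new_freqs = {h: expected[h] / n_chrom for h in HAPLOTYPES}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in HAPLOTYPES)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


# Hypothetical toy sample: 4 individuals (0,0), 4 individuals (2,2),
# and 2 phase-ambiguous double heterozygotes (1,1).
toy_sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
toy_freqs = em_haplotype_frequencies(toy_sample)
for hap, freq in toy_freqs.items():
    print(hap, round(freq, 4))
```

Because every run here starts from uniform frequencies, the sketch can settle on a local optimum, which is the weakness of standard EM noted above; in practice multiple random starts are a common mitigation.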