Breast cancer is the most common cancer in women in developed countries. To identify common breast cancer susceptibility alleles, we conducted a genome-wide association study in which 582,886 SNPs were genotyped in 3,659 cases with a family history of the disease and 4,897 controls. Promising associations were evaluated in a second stage, comprising 12,576 cases and 12,223 controls. We identified five new susceptibility loci, on chromosomes 9, 10 and 11 (P = 4.6 × 10^-7 to P = 3.2 × 10^-15). We also identified SNPs in the 6q25.1 (rs3757318, P = 2.9 × 10^-6), 8q24 (rs1562430, P = 5.8 × 10^-7) and LSP1 (rs909116, P = 7.3 × 10^-7) regions that showed more significant association with risk than those reported previously. Previously identified breast cancer susceptibility loci were also found to show larger effect sizes in this study of familial breast cancer cases than in previous population-based studies, consistent with polygenic susceptibility to the disease.
Copy number variants (CNVs) account for a major proportion of human genetic polymorphism and have been predicted to play an important role in genetic susceptibility to common disease. To address this we undertook a large direct genome-wide study of association between CNVs and eight common human diseases. Using a purpose-designed array we typed ~19,000 individuals into distinct copy-number classes at 3,432 polymorphic CNVs, including an estimated ~50% of all common CNVs larger than 500bp. We identified several biological artefacts that lead to false-positive associations, including systematic CNV differences between DNAs derived from blood and cell-lines. Association testing and follow-up replication analyses confirmed three loci where CNVs were associated with disease, IRGM for Crohn's disease, HLA for Crohn's disease, rheumatoid arthritis, and type 1 diabetes, and TSPAN8 for type 2 diabetes, though in each case the locus had previously been identified in SNP-based studies, reflecting our observation that the majority of common CNVs which are well-typed on our array are well tagged by SNPs and so have been indirectly explored through SNP studies. We conclude that common CNVs which can be typed on existing platforms are unlikely to contribute greatly to the genetic basis of common human diseases.
We conducted a genome-wide association study for testicular germ cell tumor (TGCT), genotyping 307,666 SNPs in 730 cases and 1,435 controls from the UK and replicating associations in a further 571 cases and 1,806 controls. We found strong evidence for susceptibility loci on chromosome 5 (per allele OR = 1.37 (95% CI = 1.19-1.58), P = 3 × 10^-13), chromosome 6 (OR = 1.50 (95% CI = 1.28-1.75), P = 10^-13) and chromosome 12 (OR = 2.55 (95% CI = 2.05-3.19), P = 10^-31). KITLG, encoding the ligand for the receptor tyrosine kinase KIT, which has previously been implicated in the pathogenesis of TGCT and the biology of germ cells, may explain the association on chromosome 12.
Testicular germ cell tumor (TGCT) is the most common malignancy in men aged 15-45 years. The worldwide incidence of the disease is 7.5 per 100,000, but the rates vary considerably between countries and ancestry groups [1]. Known risk factors include a family history of the disease, previous germ cell tumor, subfertility, undescended testis (UDT) [2] and testicular microlithiasis [3], the presence of small foci of intratesticular calcification. There are two main subclasses of TGCT: seminomas show histological features of primordial germ cells, whereas nonseminomas show varying degrees of differentiation toward embryonal and extraembryonal structures. Some tumors show features of both classes. TGCTs are believed to arise from progenitor germ cells through a preinvasive phase of intratubular germ cell neoplasia (ITGCN) [4]. The peak incidence of nonseminomas is between the ages of 20 and 30 ... Several studies have estimated the risk to brothers and fathers of individuals with TGCT to be eight- to tenfold and four- to sixfold, respectively [7], much higher than the familial risks for most other cancer classes, which are generally approximately twofold [8].
However, most families with multiple cases of TGCT include only two affected individuals, usually sibpairs, and extended pedigrees with several cases are exceedingly rare [9]. A genome-wide genetic linkage study of 179 families by an international consortium did not provide strong evidence for the location of a gene predisposing to TGCT [9]. However, candidate association studies have indicated that deletions on the Y chromosome that are also associated with infertility are implicated in TGCT susceptibility [10]. We carried out a genome-wide association study for TGCT susceptibility alleles using subjects with TGCT from the UK and the Illumina 370K array. ... Table 2 online). SNPs on chromosomes 5, 6 and 12 showed convincing evidence of association after replication (Table 1). The strongest evidence was obtained for rs995030 and rs1508595, which are located within the same LD block on chromosome 12. SNPs located in adjacent LD blocks showed much weaker evidence of association, suggesting that the causative variant resides within this block. In a multiple regression analysis, there was evidence that both rs995030 and rs1508595 are independently associated with disease risk (P = 0.03 in stage 2, P = 0.0006 overall, compare...
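The per-allele odds ratios in the TGCT abstract above are quoted with 95% confidence intervals. As a minimal sketch, the snippet below shows how such an interval maps onto the log-odds scale, where it is symmetric around the point estimate. It assumes the intervals are Wald-type intervals built on the log odds ratio, which the excerpt does not state; the function name `ci_to_log_scale` is hypothetical and chosen only for illustration.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def ci_to_log_scale(lower, upper, z=Z_95):
    """Return (geometric midpoint, SE of log OR) implied by a CI.

    Assumption (not stated in the source): the interval is
    exp(log(OR) +/- z * SE), i.e. a Wald interval on the log scale.
    """
    log_lo, log_hi = math.log(lower), math.log(upper)
    midpoint = math.exp((log_lo + log_hi) / 2)  # should be close to the reported OR
    se_log_or = (log_hi - log_lo) / (2 * z)     # half-width on the log scale / z
    return midpoint, se_log_or


# Odds ratios and intervals as quoted in the abstract: (OR, lower, upper)
reported = {
    "chromosome 5": (1.37, 1.19, 1.58),
    "chromosome 6": (1.50, 1.28, 1.75),
    "chromosome 12": (2.55, 2.05, 3.19),
}

for locus, (odds_ratio, lo, hi) in reported.items():
    mid, se = ci_to_log_scale(lo, hi)
    print(f"{locus}: reported OR={odds_ratio:.2f}, "
          f"log-scale midpoint={mid:.2f}, SE(log OR)={se:.3f}")
```

If the assumption holds, each geometric midpoint should agree with the reported odds ratio up to rounding, which is a quick consistency check on transcribed values.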
Crohn's disease (CD) is a complex genetic disorder for which more than 140 genes have been identified using genome-wide association studies (GWAS). However, the genetic architecture of the trait remains largely unknown. The recent development of machine learning (ML) approaches prompted us to apply them to classify healthy and diseased people according to their genomic information. The Immunochip dataset containing 18,227 CD patients and 34,050 healthy controls enrolled and genotyped by the international Inflammatory Bowel Disease genetic consortium (IIBDGC) was re-analyzed using a set of ML methods: penalized logistic regression (LR), gradient boosted trees (GBT) and artificial neural networks (NN). The main score used to compare the methods was the Area Under the ROC Curve (AUC) statistic. An assessment of the impact of quality control (QC), imputation and coding methods on LR results showed that QC methods and imputation of missing genotypes may artificially increase the scores. In contrast, neither the patient/control ratio nor marker preselection or coding strategies significantly affected the results. LR methods, including Lasso, Ridge and ElasticNet, provided similar results, with a maximum AUC of 0.80. GBT methods such as XGBoost, LightGBM and CatBoost, together with dense NN with one or more hidden layers, provided similar AUC values, suggesting limited epistatic effects in the genetic architecture of the trait. ML methods detected nearly all the genetic variants previously identified by GWAS among the best predictors, plus additional predictors with lower effects. The robustness and complementarity of the different methods were also studied. Compared to LR, non-linear models such as GBT or NN may provide robust complementary approaches to identify and classify genetic markers.
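The abstract above compares classifiers by their AUC. As a minimal sketch of what that statistic measures, the snippet below computes it through the Mann-Whitney rank formulation: the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def roc_auc(labels, scores):
    """AUC from binary labels (1 = case, 0 = control) and model scores.

    Uses the rank-sum identity AUC = U / (n_cases * n_controls),
    with average ranks assigned to tied scores.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j over the run of scores tied with position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one case and one control")

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Toy example (hypothetical scores): two controls, two cases.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to scores that carry no information about case status, and 1.0 to perfect separation, which gives some context for the reported maximum of 0.80.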
The SNP rs13387042 is associated with both ER-positive and ER-negative breast cancer in European women.