Many variants of uncertain significance (VUS) have been identified in BRCA2 through clinical genetic testing. VUS pose a significant clinical challenge because the contribution of these variants to cancer risk has not been determined. We conducted a comprehensive assessment of VUS in the BRCA2 C-terminal DNA binding domain (DBD) by using a validated functional assay of BRCA2 homologous recombination (HR) DNA-repair activity and defined a classifier of variant pathogenicity. Among 139 variants evaluated, 54 had ?99% probability of pathogenicity, and 73 had ?95% probability of neutrality. Functional assay results were compared with predictions of variant pathogenicity from the Align-GVGD protein-sequence-based prediction algorithm, which has been used for variant classification. Relative to the HR assay, Align-GVGD significantly (p < 0.05) over-predicted pathogenic variants. We subsequently combined functional and Align-GVGD prediction results in a Bayesian hierarchical model (VarCall) to estimate the overall probability of pathogenicity for each VUS. In addition, to predict the effects of all other BRCA2 DBD variants and to prioritize variants for functional studies, we used the endoPhenotype-Optimized Sequence Ensemble (ePOSE) algorithm to train classifiers for BRCA2 variants by using data from the HR functional assay. Together, the results show that systematic functional assays in combination with in silico predictors of pathogenicity provide robust tools for clinical annotation of BRCA2 VUS.
Sequencing-based genetic tests to identify individuals at increased risk of hereditary breast and ovarian cancers have resulted in the identification of more than 40,000 sequence variants of BRCA1 and BRCA2. A majority of these variants are considered to be variants of uncertain significance (VUS) because their impact on disease risk remains unknown, largely due to lack of sufficient familial linkage and epidemiological data. Several assays have been developed to examine the effect of VUS on protein function, which can be used to assess their impact on cancer susceptibility. In this study, we report the functional characterization of 88 BRCA2 variants, including several previously uncharacterized variants, using a well-established mouse embryonic stem cell (mESC)-based assay. We have examined their ability to rescue the lethality of Brca2 null mESC as well as sensitivity to six DNA damaging agents including ionizing radiation and a PARP inhibitor. We have also examined the impact of BRCA2 variants on splicing. In addition, we have developed a computational model to determine the probability of impact on function of the variants that can be used for risk assessment. In contrast to the previous VarCall models that are based on a single functional assay, we have developed a new platform to analyze the data from multiple functional assays separately and in combination. We have validated our VarCall models using 12 known pathogenic and 10 neutral variants and demonstrated their usefulness in determining the pathogenicity of BRCA2 variants that are listed as VUS or as variants with conflicting functional interpretation.
We describe the development and application of a Bayesian statistical model for the prior probability of phenotype-genotype association that incorporates data from past association studies and publicly available functional annotation data regarding the susceptibility variants under study. The model takes the form of a binary regression of association status on a set of annotation variables whose coefficients were estimated through an analysis of associated SNPs housed in the GWAS Catalog (GC). The set of functional predictors we examined includes measures that have been demonstrated to correlate with the association status of SNPs in the GC and some whose utility in this regard is speculative: summaries of the UCSC Human Genome Browser ENCODE super-track data, dbSNP function class, sequence conservation summaries, proximity to genomic variants included in the Database of Genomic Variants (DGV) and known regulatory elements included in the Open Regulatory Annotation database (ORegAnno), PolyPhen-2 probabilities and RegulomeDB categories. Because we expected that only a fraction of the annotation variables would contribute to predicting association, we employed a penalized likelihood method to reduce the impact of non-informative predictors and evaluated the model's ability to predict GC SNPs not used to construct the model. We show that the functional data alone are predictive of a SNP's presence in the GC. Further, using data from a genome-wide study of ovarian cancer, we demonstrate that their use as prior data when testing for association is practical at the genome-wide scale and improves power to detect associations.
BackgroundGenetic association studies are conducted to discover genetic loci that contribute to an inherited trait, identify the variants behind these associations and ascertain their functional role in determining the phenotype. To date, functional annotations of the genetic variants have rarely played more than an indirect role in assessing evidence for association. Here, we demonstrate how these data can be systematically integrated into an association study’s analysis plan.ResultsWe developed a Bayesian statistical model for the prior probability of phenotype–genotype association that incorporates data from past association studies and publicly available functional annotation data regarding the susceptibility variants under study. The model takes the form of a binary regression of association status on a set of annotation variables whose coefficients were estimated through an analysis of associated SNPs in the GWAS Catalog (GC). The functional predictors examined included measures that have been demonstrated to correlate with the association status of SNPs in the GC and some whose utility in this regard is speculative: summaries of the UCSC Human Genome Browser ENCODE super–track data, dbSNP function class, sequence conservation summaries, proximity to genomic variants in the Database of Genomic Variants and known regulatory elements in the Open Regulatory Annotation database, PolyPhen–2 probabilities and RegulomeDB categories. Because we expected that only a fraction of the annotations would contribute to predicting association, we employed a penalized likelihood method to reduce the impact of non–informative predictors and evaluated the model’s ability to predict GC SNPs not used to construct the model. We show that the functional data alone are predictive of a SNP’s presence in the GC. Further, using data from a genome–wide study of ovarian cancer, we demonstrate that their use as prior data when testing for association is practical at the genome–wide scale and improves power to detect associations.ConclusionsWe show how diverse functional annotations can be efficiently combined to create ‘functional signatures’ that predict the a priori odds of a variant’s association to a trait and how these signatures can be integrated into a standard genome–wide–scale association analysis, resulting in improved power to detect truly associated variants.Electronic supplementary materialThe online version of this article (doi:10.1186/1471-2164-15-398) contains supplementary material, which is available to authorized users.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.