Variant interpretation is the key issue in molecular diagnosis. Spliceogenic variants exemplify this issue, as each nucleotide variant can be deleterious via disruption or creation of splice site consensus sequences. Consequently, reliable in silico prediction of variant spliceogenicity would be a major improvement. Thanks to an international effort, a set of 395 variants studied at the mRNA level and occurring in 5′ and 3′ consensus regions (defined as the 11 and 14 bases surrounding the exon/intron junction, respectively) was collected for 11 different genes, including BRCA1, BRCA2, CFTR and RHD, and used to train and validate a new prediction protocol named Splicing Prediction in Consensus Elements (SPiCE). SPiCE combines in silico predictions from SpliceSiteFinder-like and MaxEntScan and uses logistic regression to define optimal decision thresholds. It revealed an unprecedented sensitivity and specificity of 99.5% and 95.2%, respectively, and the impact on splicing was correctly predicted for 98.8% of variants. We therefore propose SPiCE as the new tool for predicting variant spliceogenicity. It could easily be implemented in any diagnostic laboratory as a routine decision-making tool to help geneticists face the deluge of variants in the next-generation sequencing era. SPiCE is accessible at https://sourceforge.net/projects/spicev2-1/.
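To make the idea of combining two predictors through logistic regression concrete, here is a minimal Python sketch. It is not the SPiCE implementation: the coefficients, the 0.5 threshold, the use of percent score changes as inputs, and all function names are hypothetical and chosen only for illustration. The fitted model and thresholds are in the SPiCE distribution itself.

```python
import math

# Hypothetical coefficients for illustration only; NOT the fitted SPiCE values.
INTERCEPT = -3.0
COEF_SSF = -0.1   # weight on the SpliceSiteFinder-like score change
COEF_MES = -0.2   # weight on the MaxEntScan score change


def relative_change(wild_type, variant):
    """Percent change of a splice-site score from wild-type to variant."""
    if wild_type == 0:
        raise ValueError("wild-type score must be non-zero")
    return 100.0 * (variant - wild_type) / abs(wild_type)


def splicing_alteration_probability(delta_ssf, delta_mes):
    """Logistic model: p = 1 / (1 + exp(-(b0 + b1*dSSF + b2*dMES)))."""
    z = INTERCEPT + COEF_SSF * delta_ssf + COEF_MES * delta_mes
    return 1.0 / (1.0 + math.exp(-z))


def classify(delta_ssf, delta_mes, threshold=0.5):
    """Return a label and the underlying probability (assumed threshold)."""
    p = splicing_alteration_probability(delta_ssf, delta_mes)
    label = "altered" if p >= threshold else "unaltered"
    return label, p


if __name__ == "__main__":
    # A variant that halves both scores versus one that changes neither.
    print(classify(relative_change(10.0, 5.0), relative_change(8.0, 3.2)))
    print(classify(0.0, 0.0))
```

With these made-up weights, a large drop in both scores pushes the probability toward 1, while unchanged scores leave it near the baseline set by the intercept. In practice the decision threshold would be chosen on training data to balance sensitivity and specificity.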
It has long been known that canonical 5′ splice site (5′SS) GT>GC variants may be compatible with normal splicing. However, to date, the actual scale of canonical 5′SSs capable of generating wild‐type transcripts in the case of GT>GC substitutions remains unknown. Herein, combining data derived from a meta‐analysis of 45 human disease‐causing 5′SS GT>GC variants and a cell culture‐based full‐length gene splicing assay of 103 5′SS GT>GC substitutions, we estimate that ~15–18% of canonical GT 5′SSs retain their capacity to generate between 1% and 84% normal transcripts when GT is substituted by GC. We further demonstrate that the canonical 5′SSs in which substitution of GT by GC generated normal transcripts exhibit stronger complementarity to the 5′ end of U1 snRNA than those sites in which substitution of GT by GC did not lead to the generation of normal transcripts. We also observed a correlation between the generation of wild‐type transcripts and a milder than expected clinical phenotype but found that none of the available splicing prediction tools were capable of reliably distinguishing 5′SS GT>GC variants that generated wild‐type transcripts from those that did not. Our findings imply that 5′SS GT>GC variants in human disease genes may not invariably be pathogenic.
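One simple way to picture "complementarity to the 5′ end of U1 snRNA" is to count how many positions of a 9-nucleotide 5′SS window match the site that pairs perfectly with U1, CAG|GTAAGT (three exonic bases followed by six intronic bases). The Python sketch below does only that. It is an illustrative assumption, not the measure used in the study: the window definition, the function name, and the decision to ignore G-U wobble pairs and U1 base modifications are all simplifications introduced here.

```python
# 9-nt 5' splice-site window: last 3 exonic bases + first 6 intronic bases.
# CAG|GTAAGT is the site that is fully Watson-Crick complementary to the
# 5' end of U1 snRNA; matching it position by position is used here as a
# crude proxy for base pairing (assumption: wobble pairs are ignored).
U1_COMPLEMENTARY_SITE = "CAGGTAAGT"


def u1_complementarity(site):
    """Count positions of a 9-nt 5'SS that can Watson-Crick pair with U1."""
    site = site.upper().replace("U", "T")
    if len(site) != len(U1_COMPLEMENTARY_SITE):
        raise ValueError("expected a 9-nucleotide splice-site window")
    return sum(a == b for a, b in zip(site, U1_COMPLEMENTARY_SITE))


if __name__ == "__main__":
    for window in ("CAGGTAAGT", "CAGGCAAGT", "AAGGTGAGC"):
        print(window, u1_complementarity(window))
```

Under this proxy, a GT>GC substitution always costs exactly one pairing position, so sites that start with more matches at the other eight positions keep more overall complementarity after the substitution.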