We developed a simple algorithm, i-Score (inhibitory-Score), to predict active siRNAs by applying a linear regression model to 2431 siRNAs. Our algorithm is composed exclusively of nucleotide (nt) preferences at each position, and no other parameters are taken into account. Using a validation dataset of 419 siRNAs, we found that the prediction accuracy of i-Score is as good as those of s-Biopredsi, ThermoComposition21 and DSIR, which employ a neural network model or more parameters in a linear regression model. Reynolds and Katoh also predict active siRNAs efficiently, but the number of siRNAs they predict to be active is less than one-eighth of that predicted by i-Score. We additionally found that exclusion of thermostable siRNAs, whose whole stacking energy (ΔG) is less than −34.6 kcal/mol, improves the prediction accuracy of i-Score, s-Biopredsi, ThermoComposition21 and DSIR. We also developed a universal target vector, pSELL, with which we can assay the activity of an siRNA of any sequence in either the sense or antisense direction. We assayed 86 siRNAs in HEK293 cells using pSELL, and validated the applicability of i-Score and the whole ΔG value in designing siRNAs.
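The scoring idea described above, a linear model built only from per-position nucleotide preferences combined with a cutoff on the whole ΔG, can be illustrated with a minimal Python sketch. The function names, the toy 3-nt weight table and the intercept are hypothetical placeholders chosen for illustration; they are not the published i-Score coefficients, and the whole ΔG value is assumed to be computed elsewhere and passed in.

```python
from typing import Dict, List

# Threshold quoted in the abstract (kcal/mol); siRNAs below it are excluded.
DELTA_G_CUTOFF = -34.6


def linear_position_score(
    seq: str,
    weights: List[Dict[str, float]],
    intercept: float = 0.0,
) -> float:
    """Score a sequence with a linear model on position-specific nucleotides.

    weights[i][nt] is the (hypothetical) contribution of nucleotide nt at
    position i; nucleotides missing from the table contribute 0.
    """
    if len(seq) != len(weights):
        raise ValueError("sequence length must match the weight table")
    return intercept + sum(weights[i].get(nt, 0.0) for i, nt in enumerate(seq))


def passes_delta_g_filter(whole_delta_g: float) -> bool:
    """Return False for thermostable siRNAs (whole ΔG below the cutoff)."""
    return whole_delta_g >= DELTA_G_CUTOFF


# Toy example with made-up weights for a 3-nt sequence (illustration only).
toy_weights = [{"A": 1.0, "U": 0.5}, {"G": -0.5}, {"C": 2.0}]
toy_score = linear_position_score("AGC", toy_weights, intercept=10.0)
keep = passes_delta_g_filter(-30.2)   # less stable than the cutoff: kept
drop = passes_delta_g_filter(-36.0)   # more stable than the cutoff: excluded
```

In a real setting the weight table would have one entry per siRNA position and would be fitted by regression against measured activities; the sketch only shows how such a table is applied.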
We have found that two previously reported exonic mutations in the PINK1 and PARK7 genes affect pre-mRNA splicing. To develop an algorithm to predict underestimated splicing consequences of exonic mutations at the 5′ splice site, we constructed and analyzed 31 minigenes carrying exonic splicing mutations and their derivatives. We also examined 189 249 U2-dependent 5′ splice sites of the entire human genome and found that a new variable, the SD-Score, which represents the common logarithm of the frequency of a specific 5′ splice site, efficiently predicts the splicing consequences of these minigenes. We also employed the information content (Ri) to improve the prediction accuracy. We validated our algorithm by analyzing 32 additional minigenes as well as 179 previously reported splicing mutations. The SD-Score algorithm predicted aberrant splicing in 198 of 204 sites (sensitivity = 97.1%) and normal splicing in 36 of 38 sites (specificity = 94.7%). Simulation of all possible exonic mutations at positions −3, −2 and −1 of the 189 249 sites predicts that 37.8, 88.8 and 96.8% of these mutations would affect pre-mRNA splicing, respectively. We propose that the SD-Score algorithm is a practical tool to predict splicing consequences of mutations affecting the 5′ splice site.
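As a rough illustration of the definition above, the following Python sketch computes a log-frequency score for each distinct 5′ splice site sequence in a collection and reproduces the reported sensitivity and specificity from the quoted counts. It assumes that "frequency" means the relative frequency of a sequence among all sites examined; the function names and the toy 9-nt sequences are hypothetical, and the sketch omits the Ri term and any decision thresholds used by the actual algorithm.

```python
import math
from collections import Counter
from typing import Dict, List


def sd_scores(splice_sites: List[str]) -> Dict[str, float]:
    """Map each distinct site sequence to log10 of its relative frequency.

    Assumption: frequency = occurrences of the sequence / total sites.
    """
    total = len(splice_sites)
    counts = Counter(splice_sites)
    return {site: math.log10(n / total) for site, n in counts.items()}


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of aberrantly spliced sites that were predicted as such."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of normally spliced sites that were predicted as such."""
    return true_neg / (true_neg + false_pos)


# Toy collection of splice site sequences (made up for illustration).
toy_sites = ["CAGGTAAGT"] * 8 + ["AAGGTGAGT"] * 2
toy_scores = sd_scores(toy_sites)  # the rarer sequence gets the lower score

# Counts quoted in the abstract: 198 of 204 and 36 of 38.
sens_pct = round(100 * sensitivity(198, 204 - 198), 1)
spec_pct = round(100 * specificity(36, 38 - 36), 1)
```

Because the score is a logarithm of a frequency no greater than 1, it is never positive, and rarer splice site sequences receive more negative values.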