Short interfering RNAs are used in functional genomics studies to knock down a single gene in a reversible manner. The results of siRNA experiments are highly dependent on the choice of siRNA sequence. To evaluate siRNA design rules, we collected a database of 398 siRNAs of known efficacy from 92 genes. We used this database to evaluate rules previously proposed from smaller datasets, and to find a new set of rules that are optimal for the entire database. We also trained a regression tree with full cross-validation. It was, however, difficult to obtain the same precision as methods previously tested on small datasets from one or two genes. We show that those methods are overfitted, as they work poorly on independent validation datasets from multiple genes. Our new design rules can predict siRNAs with efficacy ≥50% in 91% of cases, and with efficacy ≥90% in 52% of cases, which is more than a twofold improvement over random selection. Software for designing siRNAs is available online via a web server at http://sisearch.cgb.ki.se/ or as a standalone version for high-throughput applications.

© 2004 Elsevier Inc. All rights reserved.

RNAi is a recently discovered biological phenomenon whereby a single gene may be inhibited at the RNA stage of synthesis. Short interfering RNAs (siRNAs) are duplexes of two RNA molecules, typically 21-mers with a 2-nt 3′ overhang [1]. One strand is loaded into the RISC complex [2], after which sequence-specific cleavage of the target takes place. The strength of the principle behind any form of RNA targeting (siRNA and antisense) is that the molecule can be used to inhibit expression of any mRNA, and thus the protein it encodes. This effect can be demonstrated without affecting related proteins, making it an invaluable tool for functional genomics. siRNAs have been found to be effective in Arabidopsis thaliana, Drosophila melanogaster, Caenorhabditis elegans, and mammals [3].
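
The duplex geometry described above (two 21-mers, each with a 2-nt 3′ overhang) can be illustrated with a minimal Python sketch that enumerates candidate duplexes along an mRNA. It assumes a 23-nt target site whose central 19 nt form the paired region; that windowing convention and the names `reverse_complement` and `sirna_duplexes` are illustrative assumptions, not part of the software described in this paper.

```python
# Hypothetical helper names; a sketch of the duplex geometry only,
# not the design rules or the software described in the paper.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def sirna_duplexes(mrna, strand_len=21, overhang=2):
    """Yield (position, sense, antisense) for every target window.

    Assumption: each target site spans strand_len + overhang nt (23 nt
    by default). The sense strand is the last 21 nt of the window and
    the antisense strand is the reverse complement of the first 21 nt,
    so the strands pair over 19 nt and each has a 2-nt 3' overhang.
    """
    mrna = mrna.upper().replace("T", "U")  # accept cDNA-style input
    site_len = strand_len + overhang
    for pos in range(len(mrna) - site_len + 1):
        window = mrna[pos:pos + site_len]
        sense = window[overhang:]
        antisense = reverse_complement(window[:strand_len])
        yield pos, sense, antisense


if __name__ == "__main__":
    target = "AACGUACGUACGUACGUACGUUU"  # made-up 23-nt example sequence
    for pos, sense, antisense in sirna_duplexes(target):
        print(pos, sense, antisense)
```

Every window is returned without filtering; ranking the candidates by predicted efficacy is the separate problem that design rules address.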
Many reviews of the biological processes behind siRNA inhibition exist; see, for example, [3,4].

Efficient, reliable design of high-efficacy siRNA molecules is essential to meet the need for cost-effective high-throughput functional genomics projects. To meet this need, the designed siRNAs should at least conform to the following criteria: (a) be predicted with high accuracy, (b) be sequence specific, and (c) be produced in a form that facilitates high-throughput production. In this paper we address these criteria, focusing primarily on (a) and (b).

Despite the apparent ease of designing siRNAs (compared with a popular DNA-based knockdown technique, antisense oligonucleotides (AOs)), a number of problems remain. Randomly selected siRNAs produce knockdown ≥50% with a 58-78% success rate, while very effective siRNAs (≥90/95%) are found by chance 11-18% of the time [5,6].

Initial design paradigms for siRNAs were based on motif rules, such as AAN (19)