Solubility of organic compounds in DMSO is an important issue for commercial and academic organizations that handle large compound collections or perform biological screening. In particular, solubility data are critical for optimizing storage conditions and for selecting compounds for bioscreening that are compatible with the assay protocol. Solubility is largely determined by the solvation energy and the crystal disruption energy, and these molecular phenomena should be assessed in structure-solubility correlation studies. The authors summarize their long-term experimental observations and theoretical studies of the physicochemical determinants of DMSO solubility of organic substances. They compiled a comprehensive reference database of proprietary data on compound solubility (55,277 compounds with good DMSO solubility and 10,223 compounds with poor DMSO solubility), calculated specific molecular descriptors (topological, electromagnetic, charge, and lipophilicity parameters), and applied an advanced machine-learning approach for training neural networks to predict solubility. Both supervised (feed-forward, back-propagated neural networks) and unsupervised (Kohonen neural networks) learning methods were used. The resulting neural network models were validated by successfully predicting DMSO solubility of compounds in independent test selections. (Journal of Biomolecular Screening 2004:22-31)
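
To make the unsupervised part of the approach concrete, the following is a minimal sketch of a Kohonen (self-organizing map) training loop in plain Python. It is not the authors' implementation: the function names `best_matching_unit` and `train_som`, the 3 x 3 grid, the linear decay schedules, and the two-dimensional toy vectors are all assumptions chosen for illustration, and the toy vectors are made-up placeholders rather than descriptors or data from the study.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is closest to x."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small Kohonen map; all hyperparameters are illustrative assumptions."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Random initial weights in [0, 1); inputs are assumed to be scaled to [0, 1].
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                 # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.01    # neighbourhood radius shrinks
        for x in data:
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood measured on the grid, not in input space.
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
    return weights


# Made-up, pre-scaled descriptor vectors for illustration only (not study data).
toy = [[0.10, 0.20], [0.15, 0.10], [0.90, 0.80], [0.85, 0.95]]
som = train_som(toy)
for x in toy:
    print(x, "->", best_matching_unit(som, x))
```

In a setting like the one described, each compound would be represented by its scaled descriptor vector, and map nodes could afterwards be inspected for how many soluble and poorly soluble compounds they attract; that labeling step is omitted here.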