After market launch, new information on the adverse effects of medicinal products is first highlighted almost exclusively by spontaneous reporting. As data sets of spontaneous reports have grown and computational capability has increased, quantitative methods have been applied to them more and more often. Screening such data sets is an application of knowledge discovery in databases (KDD). Effective KDD is an iterative and interactive process made up of the following steps: developing an understanding of the application domain, creating a target data set, cleaning and pre-processing the data, reducing and projecting the data, choosing the data mining task, choosing the data mining algorithm, mining the data, interpreting the results, and consolidating and using the acquired knowledge. The KDD process as it applies to the analysis of spontaneous reports can be exemplified by its routine use on the 3.5 million suspected adverse drug reaction (ADR) reports in the WHO ADR database. Examples of new adverse effects first highlighted by the KDD process on WHO data include glaucoma with topiramate, vasculitis with infliximab, and the association between selective serotonin reuptake inhibitors (SSRIs) and neonatal convulsions. The KDD process has already improved our ability to highlight previously unsuspected ADRs in spontaneous reports for clinical review, and we anticipate that such techniques will increasingly be used to screen other healthcare data sets, such as patient records.
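The KDD steps listed above can be sketched on a toy data set. This is a minimal illustration only, not the method applied to the WHO database: the reports, drug names, and event names are hypothetical, and the screening measure (the base-2 logarithm of the observed-to-expected report count for each drug-event pair) is an assumption chosen for simplicity, as are the `threshold` and `min_count` parameters.

```python
from collections import Counter
from math import log2

# Hypothetical toy reports: (report_id, drug, reaction). Not real data.
RAW_REPORTS = [
    (1, "DrugA", "event_x"),
    (1, "DrugA", "event_x"),   # exact duplicate, dropped during cleaning
    (2, "DrugA ", "Event_X"),  # same pair, inconsistent formatting
    (3, "DrugA", "event_y"),
    (4, "DrugB", "event_y"),
    (5, "DrugB", "event_y"),
    (6, "DrugB", "event_z"),
    (7, "DrugC", "event_y"),
    (8, "DrugC", "event_z"),
]


def clean(reports):
    """Cleaning and pre-processing: normalise text, drop exact duplicates."""
    seen = set()
    cleaned = []
    for report_id, drug, reaction in reports:
        record = (report_id, drug.strip().lower(), reaction.strip().lower())
        if record not in seen:
            seen.add(record)
            cleaned.append(record)
    return cleaned


def reduce_to_counts(reports):
    """Data reduction: keep only pair counts and marginal counts."""
    pair_counts = Counter((drug, reaction) for _, drug, reaction in reports)
    drug_counts = Counter(drug for _, drug, _ in reports)
    reaction_counts = Counter(reaction for _, _, reaction in reports)
    return pair_counts, drug_counts, reaction_counts, len(reports)


def mine(pair_counts, drug_counts, reaction_counts, total):
    """Data mining: log2(observed / expected) for each drug-event pair.

    The expected count assumes drug and reaction are reported independently.
    """
    scores = {}
    for (drug, reaction), observed in pair_counts.items():
        expected = drug_counts[drug] * reaction_counts[reaction] / total
        scores[(drug, reaction)] = log2(observed / expected)
    return scores


def screen(reports, threshold=1.0, min_count=2):
    """Run the pipeline and return pairs flagged for clinical review.

    threshold and min_count are illustrative values, not validated cut-offs.
    """
    pair_counts, drug_counts, reaction_counts, total = reduce_to_counts(clean(reports))
    scores = mine(pair_counts, drug_counts, reaction_counts, total)
    flagged = [
        (pair, score)
        for pair, score in scores.items()
        if score >= threshold and pair_counts[pair] >= min_count
    ]
    # Interpretation support: the most disproportionate pairs come first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for (drug, reaction), score in screen(RAW_REPORTS):
        print(f"{drug} / {reaction}: log2(observed/expected) = {score:.2f}")
```

On this toy input only the DrugA and event_x pair is flagged. A flagged pair is a prompt for clinical review rather than evidence of causation, which is why interpretation remains a separate step in the process.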