As more health systems have adopted electronic health records, the amount of searchable and analyzable data has grown explosively. These data include not only provider- and laboratory-generated data but also data collected by instruments, personal devices, and patients themselves, among other sources. As a result, more attention is being paid to analyzing these data to answer previously unaddressed questions. This is especially important given the number of therapies previously found to be beneficial in clinical trials that are currently being re-scrutinized. Because these data sets contain orders of magnitude more information, a fundamentally different approach is needed for their processing and analysis and for the generation of knowledge from them. Health care and medicine are drivers of this phenomenon and will ultimately be its main beneficiaries. At the same time, many different types of questions can now be asked using these data sets. Research groups have become increasingly active in mining large data sets, including nationwide health care databases, to learn about associations between medication use and various unrelated diseases such as cancer. Given the recent increase in research activity in this area, its promise to radically change clinical research, and the relative lack of widespread knowledge about its potential and advances, we surveyed the available literature to understand the strengths and limitations of these new tools. We also outline new databases and techniques that are available to researchers worldwide, with a special focus on work pertaining to the broad and rapid monitoring of drug safety and secondary effects.