The validity of preclinical studies of candidate therapeutic agents has been questioned because such studies predict the fate of those agents in clinical development poorly, in part owing to design flaws and reporting bias. In this study, we examined this issue in depth by conducting a meta-analysis of animal studies investigating the efficacy of the clinically approved kinase inhibitor sorafenib. MEDLINE, Embase, and BIOSIS databases were searched for all animal experiments testing tumor volume response to sorafenib monotherapy in any cancer published up to April 20, 2012. We estimated effect sizes from experiments assessing changes in tumor volume and conducted subgroup analyses based on prespecified experimental design elements associated with internal, construct, and external validity. The meta-analysis included 97 experiments involving 1,761 animals. A further 94 experiments were excluded because of inadequate reporting of data. Design elements intended to reduce threats to internal validity were implemented only sporadically, with 66% reporting animal attrition and none reporting blinded outcome assessment or concealed allocation. Anticancer activity against various malignancies was typically tested in only a small number of model systems. Effect sizes were significantly smaller when sorafenib was tested against either a different active agent or a combination arm. Trim-and-fill analysis suggested a 37% overestimation of effect sizes across all malignancies due to publication bias. We detected a moderate dose-response relationship in one clinically approved indication, hepatocellular carcinoma, but not in another approved malignancy, renal cell carcinoma, or when data were pooled across all malignancies tested. Consistent with other reports, we found that few preclinical cancer studies addressed important threats to internal, construct, and external validity, limiting their clinical generalizability. Our findings reinforce the need to improve guidelines for the design and reporting of preclinical cancer studies.
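To make the effect-size estimation step concrete, the following is a minimal sketch of how per-experiment effect sizes on tumor volume might be computed and pooled. The abstract does not state which effect-size metric or pooling model was used, so the choices here are assumptions: a standardized mean difference with Hedges' small-sample correction, pooled with a DerSimonian-Laird random-effects model. The function names `hedges_g` and `pool_random_effects` are hypothetical, and the numbers in the usage lines are invented for illustration only; they are not data from the study.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference (Hedges' g) and its variance.

    Assumed sign convention: positive g means smaller tumor volume
    in the treated arm than in the control arm.
    """
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_c - mean_t) / pooled_sd
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return j * d, j ** 2 * var_d


def pool_random_effects(effects):
    """DerSimonian-Laird random-effects pooling.

    `effects` is a list of (effect, variance) pairs.
    Returns (pooled_effect, standard_error, tau_squared).
    """
    weights = [1 / v for _, v in effects]
    sum_w = sum(weights)
    fixed = sum(w * g for w, (g, _) in zip(weights, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (g - fixed) ** 2 for w, (g, _) in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    k = len(effects)
    # Between-experiment variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    re_weights = [1 / (v + tau2) for _, v in effects]
    pooled = sum(w * g for w, (g, _) in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return pooled, se, tau2


# Hypothetical experiments: (mean, SD, n) for treated, then for control.
experiments = [
    hedges_g(450.0, 120.0, 8, 900.0, 200.0, 8),
    hedges_g(600.0, 150.0, 10, 850.0, 180.0, 10),
    hedges_g(300.0, 90.0, 6, 700.0, 210.0, 6),
]
pooled, se, tau2 = pool_random_effects(experiments)
print(f"pooled g = {pooled:.2f}, SE = {se:.2f}, tau^2 = {tau2:.2f}")
```

Under these assumptions, a subgroup analysis amounts to calling `pool_random_effects` separately on the experiments that do or do not implement a given design element and comparing the pooled estimates. Publication-bias adjustments such as trim and fill operate on the same (effect, variance) pairs but are not sketched here.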