The success of preclinical research hinges on exploratory and confirmatory animal studies. Traditional null hypothesis significance testing is a common approach to weed out unpromising candidates from a collection of drugs, so that only the most promising treatments are funneled through to clinical research phases. Balancing the number of false discoveries and false omissions is an important aspect of this process. In this paper, we compare several preclinical research pipelines, based either on null hypothesis significance testing or on Bayesian statistical decision criteria. We build on a recently published large-scale meta-analysis of reported effect sizes in preclinical animal research and elicit an empirically informed prior distribution under which both approaches are compared. After correcting for publication bias and shrinkage of effect sizes in replication studies, simulations show that (i) a shift towards statistical approaches that explicitly incorporate the minimum clinically important difference reduces the false discovery rate of frequentist approaches and (ii) a shift towards Bayesian statistical decision criteria can improve the reliability of preclinical animal research by reducing the number of false-positive findings. These benefits hold while keeping the number of experimental units required for a confirmatory follow-up study low. The results suggest that Bayesian statistical decision criteria can help improve the reliability of preclinical animal research and should be considered more frequently in practice.