In 2015, the Open Science Collaboration replicated 100 psychological experiments. Because a considerable proportion of these replications could not confirm the original effects, and some even pointed in the opposite direction, psychological research is said to lack reproducibility. Several general criticisms may explain this finding, such as the standardized use of undirected nil-null hypothesis tests, small and selective samples, and missing corrections for multiple testing, but also widespread questionable research practices and incentives to publish only positive results. A selection of 57,909 articles from 12 renowned journals is processed with the JATSdecoder software to analyze the extent to which several empirical research practices in psychology have changed over the past 12 years. To identify journal- and time-specific changes, the relative use of statistics based on p-values, the number of reported p-values per article, the relative use of confidence intervals, directed tests, power analysis, Bayesian procedures, non-standard α levels, and correction procedures for multiple testing, as well as median sample sizes, are analyzed for articles published between 2010 and 2015 and for articles published after 2015, and in more detail for every included journal and year of publication. In addition, the origin of authorships is analyzed over time. Compared to articles published in or before 2015, the median number of reported p-values per article has decreased from 14 to 12, whereas the median proportion of significant p-values per article has remained constant at 69%. While reports of effect sizes and confidence intervals have increased, the α level is usually set to the default value of .05. The use of corrections for multiple testing has decreased. Although uncommon in either case (4% in total), directed testing has become less frequent, while Bayesian inference has become more common after 2015. The overall median estimated sample size has increased from 105 to 190.