We examined differential item functioning (DIF) and the size of cross-cultural performance differences in the Programme for International Student Assessment (PISA) 2012 mathematics data before and after applying propensity score matching. We compared the mathematics performance of Indonesian, Turkish, Australian, and Dutch students on released items. Students were matched on gender, an index of economic, social, and cultural status, and opportunity to learn, using exact, nearest-neighbor, and optimal matching. Logistic regression and structural equation modeling were used to identify DIF. When propensity scores were included as performance predictors in the DIF analyses, much less DIF was found than in the original data; similarly, when propensity scores were used as covariates in tests of country differences in mathematics performance, the effect sizes of those differences were reduced substantially. We concluded that propensity scoring provided a new tool for better controlling sources of DIF and country differences in PISA mathematics performance.
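
To make concrete what using propensity scores as performance predictors in a logistic regression DIF analysis could look like, the following is a minimal sketch and not necessarily the specification used in the study. It assumes a two-group comparison (the study compares four countries, which would need a set of group indicators), and all symbols are introduced here for illustration: $Y_{ij}$ is the scored response of student $i$ to item $j$, $S_i$ a matching score such as overall mathematics performance, $G_i$ a country indicator, and $\mathbf{x}_i$ the vector of gender, economic, social, and cultural status, and opportunity to learn.

```latex
% Step 1 (assumed form): propensity score from the background covariates
\hat{e}_i = \Pr(G_i = 1 \mid \mathbf{x}_i), \qquad
\operatorname{logit}\,\hat{e}_i = \gamma_0 + \boldsymbol{\gamma}^{\top}\mathbf{x}_i

% Step 2 (assumed form): logistic regression DIF model for item j,
% with the propensity score added as a predictor
\operatorname{logit}\,\Pr(Y_{ij} = 1) =
  \beta_{0j} + \beta_{1j} S_i + \beta_{2j} G_i + \beta_{3j} S_i G_i + \beta_{4j}\,\hat{e}_i
```

Under this sketch, $\beta_{2j} \neq 0$ would indicate uniform DIF and $\beta_{3j} \neq 0$ nonuniform DIF for item $j$. A reduction in DIF after adding $\hat{e}_i$ would correspond to these coefficients shrinking toward zero relative to the model without the $\beta_{4j}\,\hat{e}_i$ term.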