We analyse response patterns to an important survey of school children, exploiting rich auxiliary information on respondents' and non-respondents' cognitive ability that is correlated both with response and with the learning achievement the survey aims to measure. The survey is the Programme for International Student Assessment (PISA), which sets response thresholds in an attempt to control data quality. We analyse the case of England in 2000, when response rates were deemed high enough by the PISA organisers to publish the results, and in 2003, when response rates were a little lower and deemed of sufficient concern for the results not to be published. We construct weights that account for the pattern of non-response using two methods: propensity scores and the GREG estimator. There is clear evidence of biases, but no indication that the slightly higher response rates in 2000 were associated with higher quality data. This underlines the danger of using response rate thresholds as a guide to data quality.

Key words: non-response, bias, school survey, data linkage, PISA.
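To make the weighting idea concrete, the following is a minimal sketch of the propensity-score approach on simulated data, not the paper's actual implementation: response is modelled as a function of an auxiliary ability measure observed for all sampled pupils, and each respondent is then weighted by the inverse of its estimated response probability. All names (ability_score, responded, outcome) are illustrative.

```python
# Minimal sketch of propensity-score non-response weighting on simulated data.
# Variable names are illustrative, not those used in the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
ability_score = rng.normal(size=n)                  # auxiliary variable, known for all sampled pupils
p_respond = 1 / (1 + np.exp(-(0.5 + 0.8 * ability_score)))  # response depends on ability
responded = rng.random(n) < p_respond
outcome = 50 + 10 * ability_score + rng.normal(scale=5, size=n)  # achievement, seen only for respondents

# Step 1: model response as a function of the auxiliary variable.
model = LogisticRegression().fit(ability_score.reshape(-1, 1), responded)
p_hat = model.predict_proba(ability_score.reshape(-1, 1))[:, 1]

# Step 2: weight each respondent by the inverse of its estimated response propensity.
w = 1.0 / p_hat[responded]

naive_mean = outcome[responded].mean()              # biased upwards: able pupils respond more
weighted_mean = np.average(outcome[responded], weights=w)
print(f"true {outcome.mean():.2f}, naive {naive_mean:.2f}, weighted {weighted_mean:.2f}")
```

The GREG estimator, the second method used, instead calibrates the weights so that weighted totals of the auxiliary variables match their known population totals; with a single auxiliary variable and a linear working model it reduces to the familiar regression estimator.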
Acknowledgements

This research was funded by ESRC grant RES-062-23-0458 ('Hierarchical analysis of unit non-response in sample surveys'). It builds on earlier work funded by the (then) Department for Education and Skills (DfES). We thank staff at DfES, the Office for National Statistics, and the Fischer Family Trust for their help in setting up and interpreting the data. We alone remain responsible for their use. We are grateful to seminar and conference participants at DfES, the ESRC Research Methods Festival, the International Association for Research on Income and Wealth, the European University Institute, Southampton, and Bremen for their comments. Micklewright thanks the Institute for Research on Poverty at the University of Wisconsin-Madison for hospitality during a sabbatical visit when he worked on the paper. We thank the editors and referees for very helpful suggestions.