The asymptotic distribution of the linear instrumental variables (IV) estimator with an empirically selected ridge regression penalty is characterized. The regularization tuning parameter is selected by splitting the observed data into training and test samples, and it becomes an estimated parameter that converges jointly with the parameters of interest. The asymptotic distribution is a nonstandard mixture distribution. Monte Carlo simulations show that the asymptotic distribution captures the characteristics of the sampling distributions and indicate when this ridge estimator performs better than two-stage least squares. An empirical application to returns-to-education data is presented.
In the last decade, the use of simple rating and comparison surveys has proliferated on social and digital media platforms to fuel recommendations. These simple surveys, and their extrapolation with machine learning algorithms like matrix factorization, shed light on user preferences over large and growing pools of items, such as movies, songs and ads. Social scientists have a long history of measuring perceptions, preferences and opinions, often over smaller, discrete item sets with exhaustive rating or ranking surveys. This paper introduces simple surveys for social science applications. We ran experiments to compare the predictive accuracy of both individual and aggregate comparative assessments using four types of simple surveys (pairwise comparisons and ratings on 2-point, 5-point and continuous scales) in three distinct contexts: perceived Safety of Google Streetview Images, Likeability of Artwork, and Hilarity of Animal GIFs. Across contexts, we find that continuous scale ratings best predict individual assessments but consume the most time and cognitive effort. Binary choice surveys are quick and perform best at predicting aggregate assessments, which is useful for collective decision tasks, but they poorly predict personalized preferences, for which they are currently used by Netflix to recommend movies. Pairwise comparisons, by contrast, perform well at predicting personal assessments but poorly predict aggregate assessments, despite being widely used to crowdsource ideas and collective preferences. We also demonstrate how findings from these surveys can be visualized in a low-dimensional space that reveals distinct respondent interpretations of the questions asked in each context. We conclude by reflecting on differences between sparse, incomplete 'simple surveys' and their traditional survey counterparts in terms of efficiency, information elicited, and settings in which knowing less about more may be critical for social science.
Srebro is a Professor at the Toyota Technological Institute at Chicago, and part-time faculty in Computer Science and the Committee on Computational and Applied Mathematics at the University of Chicago. Srebro is interested in statistical and computational aspects of machine learning and their interaction. He has done theoretical work in statistical learning theory and in algorithms, devised novel learning models and optimization techniques, and has worked on applications in computational biology, text analysis, collaborative filtering and social science. Srebro obtained his PhD from the Massachusetts Institute of Technology in 2004.
The asymptotic distribution is presented for the linear instrumental variables model estimated with a ridge penalty and a prior, where the tuning parameter is selected with a holdout sample. The structural parameters and the tuning parameter are estimated jointly by the method of moments. A chi-squared statistic permits confidence regions for the structural parameters. The form of the asymptotic distribution provides insights on the optimal way to split the data between the training and test samples. Results for the linear regression estimated by ridge regression are presented as a special case.
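The estimation idea summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the closed-form estimator, the holdout criterion (a quadratic form in the test-sample moment conditions), the grid of penalties, and all names (`ridge_iv`, `holdout_criterion`, `select_alpha`, `beta_prior`) are assumptions made for illustration. The sketch assumes the penalized objective (1/n)(y - Xb)'P_Z(y - Xb) + alpha * ||b - beta_prior||^2, where P_Z projects onto the instruments.

```python
import numpy as np

def ridge_iv(y, X, Z, alpha, beta_prior):
    """Ridge-penalized IV estimate shrunk toward beta_prior (assumed closed form)."""
    n, k = X.shape
    # Project the regressors onto the column space of the instruments: P_Z X.
    PZX = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    A = X.T @ PZX / n + alpha * np.eye(k)       # X'P_Z X / n + alpha * I
    b = PZX.T @ y / n + alpha * beta_prior      # X'P_Z y / n + alpha * prior
    return np.linalg.solve(A, b)

def holdout_criterion(beta, y, X, Z):
    """Quadratic form in the sample moments Z'(y - X beta)/n (assumed criterion)."""
    n = len(y)
    g = Z.T @ (y - X @ beta) / n
    W = np.linalg.inv(Z.T @ Z / n)
    return float(g @ W @ g)

def select_alpha(y, X, Z, beta_prior, alphas, train_frac=0.5, seed=0):
    """Pick alpha from a grid by fitting on a training split and scoring on a test split."""
    idx = np.random.default_rng(seed).permutation(len(y))
    n_tr = int(train_frac * len(y))
    tr, te = idx[:n_tr], idx[n_tr:]
    scores = [
        holdout_criterion(ridge_iv(y[tr], X[tr], Z[tr], a, beta_prior),
                          y[te], X[te], Z[te])
        for a in alphas
    ]
    alpha_hat = alphas[int(np.argmin(scores))]
    return alpha_hat, ridge_iv(y[tr], X[tr], Z[tr], alpha_hat, beta_prior)

# Simulated data with one endogenous regressor and three instruments.
rng = np.random.default_rng(1)
n = 400
Z = rng.normal(size=(n, 3))
u = rng.normal(size=n)                           # structural error
x = Z @ np.array([1.0, 0.5, 0.5]) + 0.8 * u + rng.normal(size=n)
X = x[:, None]
y = 2.0 * x + u

alphas = [0.0, 0.01, 0.1, 1.0]
alpha_hat, beta_hat = select_alpha(y, X, Z, np.zeros(1), alphas)
print(alpha_hat, beta_hat)
```

Setting `alpha` to zero recovers two-stage least squares, and a very large `alpha` pulls the estimate to `beta_prior`. The `train_frac` argument is where the choice of split between training and test samples enters the sketch.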