The last several years have seen growth in the number of publications in economics that use principal component analysis (PCA) in welfare studies. This paper explores the ways discrete data can be incorporated into PCA. The effects of the discreteness of the observed variables on PCA are reviewed. The statistical properties of the popular Filmer and Pritchett (2001) procedure are analyzed. The concepts of polychoric and polyserial correlations are introduced, with references to the existing literature on their statistical properties. A large simulation study compares various implementations of discrete-data PCA. The simulation results show that the currently used method of running PCA on a set of dummy variables, as proposed by Filmer and Pritchett (2001), can be improved upon by procedures appropriate for discrete data, such as retaining the ordinal variables without breaking them into sets of dummy variables or using polychoric correlations. An empirical example using the Bangladesh 2000 Demographic and Health Survey helps explain the differences between the procedures.
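As an illustration of the two approaches compared in the paper, a minimal Stata sketch follows; the asset variables are hypothetical, and polychoric is a user-written package, so its syntax should be verified against its help file.

. * Filmer-Pritchett index: PCA on dummy variables
. tabulate watersource, generate(ws_)
. tabulate toilettype, generate(tt_)
. pca ws_* tt_* hasradio hastv
. predict fpindex, score

. * Alternative: PCA on the polychoric correlation matrix
. polychoric watersource toilettype hasradio hastv
. local N = r(N)
. matrix R = r(R)
. pcamat R, n(`N')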
In this article, I discuss the main approaches to resampling variance estimation in complex survey data: balanced repeated replication, the jackknife, and the bootstrap. Balanced repeated replication and the jackknife are implemented in the Stata svy suite. The bootstrap for complex survey data is implemented by the bsweights command. I describe this command and provide working examples.

Editors' note. This article was submitted and accepted before the new svy bootstrap prefix was made available in the Stata 11.1 update. The variance estimation method implemented in the new svy bootstrap prefix is equivalent to the one in bs4rw. The only real difference is syntax. For example,

. bs4rw, rw(bw*): logistic highbp height weight age female [pw=finalwgt]

is equivalent to

. svyset [pw=finalwgt], vce(bootstrap) bsrweight(bw*)
. svy: logistic highbp height weight age female

Similarly, the example using mean bootstrap replicate weights,

. local mean2fay = 1 - sqrt(1/10)
. svyset [pw=finalwgt], vce(brr) brrweight(bw*) fay(`mean2fay')
. svy: logistic highbp height weight age female

is equivalent to

. svyset [pw=finalwgt], vce(bootstrap) bsrweight(bw*) bsn(10)
. svy: logistic highbp height weight age female

The weights created by the bsweights command discussed in this article are equally applicable with the bs4rw command and with the new vce(bootstrap) and bsrweight() options of svy and svyset.
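A minimal sketch of the workflow, under stated assumptions: the design variables and reps(500) are hypothetical, and the reps() and n() options reflect my recollection of the bsweights syntax (with n(-1) requesting n_h - 1 resampled PSUs per stratum), to be checked against help bsweights.

. svyset psu [pw=finalwgt], strata(stratum)
. bsweights bw, reps(500) n(-1)
. svyset [pw=finalwgt], vce(bootstrap) bsrweight(bw*)
. svy: logistic highbp height weight age female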
In this article, I introduce the ipfraking package, which implements the weight-calibration procedure known as iterative proportional fitting, or raking, for complex survey weights. The package can handle a large number of control variables and can trim the weights in various ways. It also provides diagnostic tools for the weights it creates. I provide examples of its use and a suggested workflow for creating raked replicate weights.
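A minimal sketch of a raking call, under stated assumptions: the control totals, variable names, and category codes are hypothetical, and the ctotal() matrix-naming convention shown here (column names of the form variable:value) is my recollection of the package's interface, to be verified against help ipfraking.

. * Hypothetical population control totals for sex (1, 2) and region (1-3)
. matrix C_sex = (4900, 5100)
. matrix colnames C_sex = sex:1 sex:2
. matrix C_region = (2500, 3500, 4000)
. matrix colnames C_region = region:1 region:2 region:3
. ipfraking [pw=baseweight], ctotal(C_sex C_region) generate(rakedwt)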
This paper applies a new decomposition technique to the study of variations in poverty across the regions of Russia. The procedure, which is based on the Shapley value in cooperative game theory, allows the deviation of regional poverty levels from the all-Russia average to be attributed to three proximate sources: mean income per capita, inequality, and local prices. Contrary to expectation, regional poverty variations turn out to be due more to differences in inequality across regions than to differences in real income per capita. However, when real income per capita is split into nominal income and price components, differences in nominal incomes emerge as more important than either inequality or price effects for the majority of regions.
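In generic notation (mine, not necessarily the paper's), a Shapley decomposition assigns to each factor its marginal contribution averaged over all orderings of the factors. For the set N of n = 3 factors (mean income per capita, inequality, local prices), the contribution of factor j is

C_j = \sum_{S \subseteq N \setminus \{j\}} \frac{|S|! \, (n - |S| - 1)!}{n!} \left[ v(S \cup \{j\}) - v(S) \right],

where v(S) denotes regional poverty computed with the factors in S at their regional values and the remaining factors at their all-Russia values. By the efficiency property of the Shapley value, the three contributions sum exactly to the deviation of regional poverty from the all-Russia average, v(N) - v(\emptyset).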
Recently, R. D. Stoel, F. G. Garre, C. Dolan, and G. van den Wittenboer (2006) reviewed approaches for obtaining reference mixture distributions for difference tests when a parameter is on the boundary of the parameter space. The authors of the present study argue that this methodology is incomplete without a discussion of when the mixtures are needed and show that they become relevant only when constrained difference tests are conducted. Because constrained difference tests can hide important model misspecification, a reliable way to assess global model fit under constrained estimation is needed. Examination of the options for assessing model fit under constrained estimation reveals that no perfect solutions exist, although the conditional approach of releasing a degree of freedom for each active constraint appears to be the most methodologically sound. The authors discuss the pros and cons of constrained and unconstrained estimation and their implementation in five popular structural equation modeling packages and argue that unconstrained estimation is a simpler method that is also more informative about the sources of misfit. In practice, researchers will have trouble conducting constrained difference tests appropriately, because doing so requires a commitment to ignore Heywood cases. Consequently, mixture distributions for difference tests are rarely appropriate.
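For the simplest boundary case, a single variance tested against zero, the appropriate reference distribution is the 50:50 mixture of chi-square(0) and chi-square(1), so the mixture p-value is half the naive chi-square(1) p-value. A minimal Stata sketch, with a hypothetical test statistic:

. * LR statistic from a difference test of one variance against zero
. scalar T = 2.85
. * Naive chi2(1) p-value
. display chi2tail(1, T)
. * Mixture p-value: 0.5*chi2(0) + 0.5*chi2(1) puts mass 0.5 at zero,
. * so for T > 0 the p-value is half the naive one
. display 0.5*chi2tail(1, T)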