In most regression problems, the first task is to select the most influential predictors explaining the response and to remove the others from the model. These problems are usually referred to as variable selection problems in the statistical literature. Numerous methods have been proposed in this field, most of which address linear models. In this study we propose two variable selection criteria for regression based on two powerful dependence measures, maximal correlation and distance correlation. We focus on these two measures since they fully or partially satisfy the Rényi postulates for dependence measures, and thus they are able to detect nonlinear dependence structures. Therefore, our methods can be considered appropriate for linear as well as nonlinear regression models. Both methods are easy to implement and they perform well. We illustrate the performance of the proposed methods via simulations and compare them with two benchmark methods, the stepwise Akaike information criterion and the lasso. In several cases with linear dependence, all four methods turned out to be comparable. In the presence of nonlinear or uncorrelated dependencies, we observed that our proposed methods may be favourable. An application of the proposed methods to a real financial data set is also provided.
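To make the idea of dependence-based selection concrete, the following is a minimal sketch of ranking predictors by their sample distance correlation with the response. It is an illustration only: the abstract does not state the exact selection criterion, so the marginal ranking used here, and the function names `distance_correlation` and `rank_predictors`, are assumptions rather than the authors' method.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation between two samples of equal length."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x.reshape(x.shape[0], -1)
    y = y.reshape(y.shape[0], -1)
    # Pairwise Euclidean distance matrices.
    a = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    b = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)
    # Double centering: subtract row and column means, add the grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def rank_predictors(X, y):
    """Hypothetical helper: column indices of X, most dependent on y first."""
    X = np.asarray(X, dtype=float)
    scores = np.array([distance_correlation(X[:, j], y) for j in range(X.shape[1])])
    return list(np.argsort(-scores))
```

With a response such as `y = X[:, 0] ** 2 + noise`, the Pearson correlation between `y` and the first column is close to zero for a symmetric predictor, while its distance correlation is not, which is the kind of uncorrelated dependence the abstract refers to.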
Maximal correlation has several desirable properties as a measure of dependence, including the fact that it vanishes if and only if the variables are independent. Except for a few special cases, it is hard to evaluate maximal correlation explicitly. We focus on two-dimensional contingency tables and discuss a procedure for estimating maximal correlation, which we use for constructing a test of independence. We compare the maximal correlation test with other tests of independence by Monte Carlo simulations. When the underlying continuous variables are dependent but uncorrelated, we point out some cases for which the new test is more powerful.
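For a two-way contingency table, one standard way to obtain a plug-in value of maximal correlation is through the singular values of the table of joint proportions scaled by the square roots of its margins: the largest singular value is always 1, and the second largest equals the maximal correlation of the empirical distribution. The sketch below uses that characterization; the abstract does not spell out the authors' estimation procedure, so this should be read as one plausible approach, and `maximal_correlation` is a hypothetical name.

```python
import numpy as np


def maximal_correlation(table):
    """Plug-in maximal correlation of a two-way table of counts."""
    counts = np.asarray(table, dtype=float)
    p = counts / counts.sum()          # joint proportions
    r = p.sum(axis=1)                  # row margins
    c = p.sum(axis=0)                  # column margins
    # Drop empty rows and columns to avoid division by zero.
    keep_r, keep_c = r > 0, c > 0
    p = p[np.ix_(keep_r, keep_c)]
    r, c = r[keep_r], c[keep_c]
    q = p / np.sqrt(np.outer(r, c))    # scaled joint proportions
    s = np.linalg.svd(q, compute_uv=False)  # sorted in descending order
    # s[0] is 1 (trivial constant functions); s[1] is the maximal correlation.
    return float(s[1]) if s.size > 1 else 0.0
```

For a 2x2 table this value coincides with the absolute phi coefficient; for example, `[[40, 10], [10, 40]]` gives 0.6, and a table whose cells are the product of its margins gives a value of zero up to rounding.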
Cognitive Social Structure (CSS) network studies collect relational data on respondents' direct ties and their perception of ties among all other individuals in the network. When reporting their perception networks, respondents commit two types of errors, namely, omission (false negatives) and commission (false positives) errors. We first assess the relationship between these two error types and their contributions to the overall respondent accuracy. Next we propose a method for estimating networks based on perceptions of a random sample of respondents from a bounded social network, which utilizes the Receiver Operating Characteristic (ROC) curve for balancing the tradeoffs between omission and commission errors. A comparative numerical study shows that the proposed estimation method performs well. This new method can be easily integrated into organization studies that use randomized surveys to study multiple organizations. The burgeoning field of multilevel analysis of inter-organizational networks can also immensely benefit from this approach.
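As a rough illustration of how an ROC curve can balance omission against commission errors, the sketch below picks a cut-off on the share of sampled respondents who report a tie. Everything specific here is an assumption, not the authors' procedure: the use of Youden's J (true positive rate minus false positive rate) as the balancing rule, the availability of a reference set of dyads with known tie status, and the name `choose_threshold`.

```python
import numpy as np


def choose_threshold(vote_share, truth):
    """Cut-off on vote share that maximizes Youden's J = TPR - FPR.

    vote_share: fraction of sampled respondents reporting each dyad as a tie.
    truth: reference tie status for the same dyads (must contain both classes).
    """
    vote_share = np.asarray(vote_share, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    best_t, best_j = None, -np.inf
    for t in np.unique(vote_share):    # candidate cut-offs, ascending
        pred = vote_share >= t
        tpr = (pred & truth).sum() / truth.sum()        # 1 - omission rate
        fpr = (pred & ~truth).sum() / (~truth).sum()    # commission rate
        j = tpr - fpr
        if j > best_j:
            best_t, best_j = float(t), float(j)
    return best_t, best_j
```

Raising the cut-off reduces commission errors at the cost of more omission errors; each candidate cut-off corresponds to one point on the ROC curve, and the rule above simply selects the point farthest above the diagonal.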