Discriminant validity was originally presented as a set of empirical criteria that can be assessed from multitrait-multimethod (MTMM) matrices. Because datasets used by applied researchers rarely lend themselves to MTMM analysis, the need to assess discriminant validity in empirical research has led to the introduction of numerous techniques, some of which have been proposed in an ad hoc manner and without rigorous methodological support. We review various definitions of and techniques for assessing discriminant validity and provide a generalized definition of discriminant validity based on the correlation between two measures after measurement error has been considered. We then review techniques that have been proposed for discriminant validity assessment, demonstrating some problems and equivalencies of these techniques that have gone unnoticed by prior research. After conducting Monte Carlo simulations that compare the techniques, we present techniques called CICFA(sys) and χ²(sys) that applied researchers can use to assess discriminant validity.
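The following is a minimal, illustrative sketch of the correlation-based logic described above, not the paper's exact CICFA(sys) or χ²(sys) procedures (which rely on confirmatory factor analysis). It estimates the correlation between two scale scores corrected for measurement error (disattenuated by Cronbach's alpha) and a percentile bootstrap confidence interval for it; the item column names and the reading of "values near 1" as problematic are assumptions made for the example.

```python
# Sketch: bootstrap CI for a disattenuated correlation between two scales.
# Column names and the interpretation threshold are assumptions.
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a set of items (columns of a DataFrame)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

def disattenuated_r(a_items: pd.DataFrame, b_items: pd.DataFrame) -> float:
    """Correlation between scale scores corrected for unreliability."""
    a, b = a_items.mean(axis=1), b_items.mean(axis=1)
    r = np.corrcoef(a, b)[0, 1]
    return r / np.sqrt(cronbach_alpha(a_items) * cronbach_alpha(b_items))

def bootstrap_ci(a_items, b_items, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for the disattenuated correlation."""
    rng = np.random.default_rng(seed)
    n = len(a_items)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        stats.append(disattenuated_r(a_items.iloc[idx], b_items.iloc[idx]))
    lo, hi = np.quantile(stats, [(1 - level) / 2, 1 - (1 - level) / 2])
    return lo, hi

# Usage with assumed item names:
# data = pd.read_csv("survey.csv")
# lo, hi = bootstrap_ci(data[["a1", "a2", "a3"]], data[["b1", "b2", "b3"]])
# A correlation whose interval reaches values near 1 signals a discriminant
# validity problem: the two measures may not reflect distinct constructs.
```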
Partial least squares path modeling (PLS) was developed in the 1960s and 1970s as a method for predictive modeling. In the succeeding years, applied disciplines, including organizational and management research, have developed beliefs about the capabilities of PLS and its suitability for different applications. On close examination, some of these beliefs prove to be unfounded and to bear little correspondence to the actual capabilities of PLS. In this article, we critically examine several of these commonly held beliefs. We describe their origins, and, using simple examples, we demonstrate that many of these beliefs are not true. We conclude that the method is widely misunderstood, and our results cast strong doubts on its effectiveness for building and testing theory in organizational research.
Partial least squares (PLS) path modeling is increasingly being promoted as a technique of choice for various analysis scenarios, despite the serious shortcomings of the method. The current lack of methodological justification for PLS prompted the editors of this journal to declare that research using this technique is likely to be desk-rejected (Guide and Ketokivi, 2015). To clarify why PLS is inappropriate for applied research, we provide a non-technical review and empirical demonstration of its inherent, intractable problems. We show that although the PLS technique is promoted as a structural equation modeling (SEM) technique, it is simply regression with scale scores and thus has very limited capabilities to handle the wide array of problems for which applied researchers use SEM. We then explain why the use of PLS weights and many of the rules of thumb commonly employed with PLS are unjustifiable, and why the touted advantages of the method are untenable.
Entities such as individuals, teams, or organizations can vary systematically from one another. Researchers typically model such data using multilevel models, assuming that the random effects are uncorrelated with the regressors. Violating this testable assumption, which is often ignored, creates an endogeneity problem, preventing causal interpretation. Focusing on two-level models, we explain how researchers can avoid this problem by including cluster means of the Level 1 explanatory variables as controls; we explain this point conceptually and with a large-scale simulation. We further show why the common practice of centering the predictor variables is mostly unnecessary. Moreover, to examine the state of the science, we reviewed 204 randomly drawn articles from macro and micro organizational science and applied psychology journals, finding that only 106 articles, with a slightly higher proportion from macro-oriented fields, properly deal with the random effects assumption. Alarmingly, most models also failed the usual exogeneity requirement for the regressors, leaving only 25 mostly macro-level articles that potentially reported trustworthy multilevel estimates. We offer a set of practical recommendations for researchers to model multilevel data appropriately.
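Below is a minimal sketch of the recommendation described above: adding cluster means of the Level 1 predictors as controls in a two-level random-intercept model (a Mundlak-type specification). The variable names (y, x, cluster) and the file name are assumptions for illustration; the model is fit with statsmodels.

```python
# Sketch: two-level random-intercept model with cluster means as controls.
# Variable names and the data file are assumed for this example.
import pandas as pd
import statsmodels.formula.api as smf

data = pd.read_csv("two_level_data.csv")

# Cluster mean of the Level 1 predictor, added as a Level 2 control.
data["x_mean"] = data.groupby("cluster")["x"].transform("mean")

# Random-intercept model; including x_mean guards against bias arising from
# correlation between the cluster-level random effects and x.
model = smf.mixedlm("y ~ x + x_mean", data=data, groups=data["cluster"])
result = model.fit(reml=True)
print(result.summary())

# If the coefficient on x_mean differs reliably from zero, a model omitting it
# would have violated the random effects assumption discussed above.
```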
Confidence intervals (CIs) are an alternative to null hypothesis significance testing (NHST) (Nickerson 2000; Wood 2005); both techniques essentially convey information about how certain we are of an estimate. A CI consists of two values, upper and lower confidence limits, associated with a confidence level, typically 95%. Valid CIs should satisfy two important conditions (DiCiccio and Efron 1996). First, a CI should contain the population value of the parameter under estimation with the stated degree of confidence over a large number of repeated samples. For example, a 95% CI should contain the population value of a parameter 95% of the time, with the true value of the parameter falling outside the interval in only 5% of cases. Second, in those cases where the true value of the parameter falls beyond the boundaries of the interval, it should do so in a balanced way. Using the 95% CI again as an example, the population value should be higher than the upper boundary in 2.5% of the samples and lower than the lower boundary of the interval 2.5% of the time.

We illustrate these two properties of CIs in Figure A1. The figure shows 250 correlation estimates drawn from three different populations (where the true value of the correlation is 0, 0.3, or 0.6, respectively), ordered from smallest to largest, and their 95% CIs. In this scenario, the population value falls outside the CI only about 5% of the time and does so in a balanced way, such that the population value lies above the CI 2.5% of the time and below the CI 2.5% of the time. The figure also shows that both the variance of the estimates and the width of the CIs depend on the population value of the correlation; when the population correlation is zero, the difference between the largest and smallest estimate is close to 0.5, but when the population value is 0.6 this difference decreases to about 0.35. Similarly, the CIs are narrower for larger estimates. This is an important feature of CIs that unfortunately complicates their calculation, as we discuss later.

The CI of a correlation has a valid closed-form solution, but estimating CIs for more complex scenarios is a non-trivial problem. The most straightforward way to estimate CIs is to use a known theoretical distribution. We refer to these as parametric approaches. When the distribution of the estimates is not known, as is the case with those obtained from PLSc, CIs based on bootstrapping provide an attractive alternative (Wood 2005). In these approaches, which we refer to as empirical, the endpoints of the CIs are not taken from a known statistical distribution; rather, the values are obtained from the empirically approximated bootstrap distribution. Bootstrapping means that we draw a large number of samples from our original data and calculate the statistic for each sample. The samples are drawn with replacement, which means that each observation in the original sample can be included in each bootstrap sample multiple times. While bootstrapping can be useful when working with statistics whose samp...
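The sketch below illustrates the two approaches discussed above for a simple correlation: a parametric 95% CI based on the Fisher z transformation and an empirical 95% CI based on the percentile bootstrap. The data are simulated here purely for demonstration; in practice x and y would be observed variables.

```python
# Sketch: parametric (Fisher z) vs. percentile bootstrap CI for a correlation.
# Simulated data are used only to make the example self-contained.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 200
x = rng.normal(size=n)
y = 0.3 * x + rng.normal(size=n)  # population correlation is roughly 0.29

r = np.corrcoef(x, y)[0, 1]

# Parametric 95% CI via the Fisher z transformation (closed-form solution).
z = np.arctanh(r)
se = 1 / np.sqrt(n - 3)
z_crit = stats.norm.ppf(0.975)
ci_parametric = np.tanh([z - z_crit * se, z + z_crit * se])

# Empirical 95% CI via percentile bootstrap: resample pairs with replacement
# and take the 2.5% and 97.5% quantiles of the bootstrap distribution.
boot_r = []
for _ in range(5000):
    idx = rng.integers(0, n, size=n)
    boot_r.append(np.corrcoef(x[idx], y[idx])[0, 1])
ci_bootstrap = np.quantile(boot_r, [0.025, 0.975])

print("r =", round(r, 3))
print("parametric CI:", np.round(ci_parametric, 3))
print("bootstrap  CI:", np.round(ci_bootstrap, 3))
```

For a plain correlation the two intervals agree closely; the value of the bootstrap approach lies in statistics, such as PLSc estimates, whose theoretical distribution is not available.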