Consider a linear model Y = Xβ + z, z ∼ N(0, I_n). Here, X = X_{n,p}, where both p and n are large, but p > n. We model the rows of X as i.i.d. samples from N(0, (1/n)Ω), where Ω is a p × p correlation matrix, which is unknown to us but is presumably sparse. The vector β is also unknown but has relatively few nonzero coordinates, and we are interested in identifying these nonzeros. We propose Univariate Penalization Screening (UPS) for variable selection. This is a screen-and-clean method in which we screen with univariate thresholding and clean with penalized MLE. It has two important properties: sure screening and separable after screening. These properties enable us to reduce the original regression problem to many small-size regression problems that can be fitted separately. The UPS is effective both in theory and in computation. We measure the performance of a procedure by the Hamming distance, and use an asymptotic framework where p → ∞ and other quantities (e.g., n, the sparsity level, and the strength of signals) are linked to p by fixed parameters. We find that in many cases, the UPS achieves the optimal rate of convergence. Also, for many different Ω, there is a common three-phase diagram in the two-dimensional phase space quantifying the signal sparsity and signal strength. In the first phase, it is possible to recover all signals. In the second phase, it is possible to recover most of the signals, but not all of them. In the third phase, successful variable selection is impossible. UPS partitions the phase space in the same way that the optimal procedures do, and recovers most of the signals as long as successful variable selection is possible. The lasso and subset selection are well-known approaches to variable selection. However, somewhat surprisingly, there are regions in the phase space where neither of them is rate optimal, even in very simple settings, such as when Ω is tridiagonal, and even when the tuning parameter is ideally set.
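The screen-and-clean idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tuning parameters `t`, `lam`, and `delta`, the use of the empirical Gram matrix as a stand-in for the unknown Ω, and the choice of an L0-type penalty fitted by exhaustive search are all assumptions made for illustration.

```python
import itertools
import numpy as np

def _connected_components(adj):
    """Connected components of a graph given as a boolean adjacency matrix."""
    m = adj.shape[0]
    seen = np.zeros(m, dtype=bool)
    comps = []
    for start in range(m):
        if seen[start]:
            continue
        stack, comp = [start], []
        seen[start] = True
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in np.flatnonzero(adj[u]):
                if not seen[v]:
                    seen[v] = True
                    stack.append(v)
        comps.append(comp)
    return comps

def ups_sketch(X, y, t, lam, delta=0.1):
    """Screen-and-clean sketch; t, lam, delta are hypothetical tuning parameters."""
    n, p = X.shape
    # Screen: univariate thresholding of the marginal statistics X_j' y.
    stats = X.T @ y
    survivors = np.flatnonzero(np.abs(stats) > t)
    # Split survivors into small groups. The empirical Gram matrix is used
    # here in place of the unknown Omega (an assumption of this sketch).
    gram = X[:, survivors].T @ X[:, survivors]
    components = _connected_components(np.abs(gram) > delta)
    # Clean: L0-penalized least squares on each group, by exhaustive search
    # (exponential in the group size, so only sensible for small groups).
    beta = np.zeros(p)
    for comp in components:
        idx = survivors[comp]
        best_cost, best_fit = 0.5 * float(y @ y), None  # cost of the empty model
        for k in range(1, len(idx) + 1):
            for subset in itertools.combinations(idx, k):
                cols = list(subset)
                coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
                resid = y - X[:, cols] @ coef
                cost = 0.5 * float(resid @ resid) + 0.5 * lam ** 2 * k
                if cost < best_cost:
                    best_cost, best_fit = cost, (cols, coef)
        if best_fit is not None:
            beta[best_fit[0]] = best_fit[1]
    return beta
```

Calling `ups_sketch(X, y, t, lam)` returns a coefficient vector whose nonzero entries are the selected variables. Each group is fitted against `y` on its own, which mirrors the "many small-size regression problems" idea only loosely.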
Summary: Over the past 20 years, research has suggested that social workers and counselors are at risk of developing secondary traumatic stress from working with traumatized client populations. However, only a few studies have examined specific risk and protective factors that may buffer the social worker from developing secondary trauma symptoms. This article reports the results from a cross-sectional survey-based study of clinical social workers in which a predictive model of secondary traumatic stress was sought. In order to obtain an optimally predictive subset of variables from a larger set of candidate variables, this study employed a rigorous variable selection procedure using criteria-based methods for arriving at a final model predicting secondary trauma. Findings: The results suggest that ratings of the supervisory relationship, salary, caseload size, and personal anxiety may be salient factors in the development of secondary trauma among clinical social workers. Specifically, positive ratings of supervision and higher income level were found to predict a substantial decrease in the degree to which a social worker experienced secondary trauma symptoms. Applications: Secondary trauma threatens clinician health and the quality of client services, and it contributes to increased financial burdens on nonprofit agencies due to burnout and employee turnover. At an organizational level, administrators and policymakers can address these problems by providing higher salaries, encouraging reasonable client caseloads, and ensuring that social workers have access to skilled clinical supervisors. At an individual level, personal self-care to reduce daily anxiety may be important to protect clinical social workers from developing secondary trauma symptoms.
We collected and cleaned a large data set on publications in statistics. The data set consists of the coauthor relationships and citation relationships of 83,331 papers published in 36 representative journals in statistics, probability, and machine learning, spanning 41 years. The data set allows us to construct many different networks, and motivates a number of research problems about the research patterns and trends, research impacts, and network topology of the statistics community. In this paper we focus on (i) using the citation relationships to estimate the research interests of authors, and (ii) using the coauthor relationships to study the network topology. Using co-citation networks we constructed, we discover a "statistics triangle", reminiscent of the statistical philosophy triangle (Efron, 1998). We propose new approaches to constructing the "research map" of statisticians, as well as the "research trajectory" for a given author to visualize the evolution of his or her research interests. Using coauthorship networks we constructed, we discover a multi-layer community tree and produce a Sankey diagram to visualize author migrations across different sub-areas. We also propose several new metrics for the research diversity of individual authors. We find that "Bayes", "Biostatistics", and "Nonparametric" are three primary areas in statistics. We also identify 15 sub-areas, each of which can be viewed as a weighted average of the primary areas, and identify several underlying reasons for the formation of coauthorship communities. We also find that the research interests of statisticians have evolved significantly in the 41-year time window we studied: some areas (e.g., biostatistics and high-dimensional data analysis) have become increasingly popular. The research diversity of statisticians may be lower than we might have expected. For example, for the personalized networks of most authors, the p-values of the proposed significance tests are relatively large.
In this paper, we present CensNet, Convolution with Edge-Node Switching graph neural network, for semi-supervised classification and regression in graph-structured data with both node and edge features. CensNet is a general graph embedding framework, which embeds both nodes and edges into a latent feature space. By using the line graph of the original undirected graph, the roles of nodes and edges are switched, and two novel graph convolution operations are proposed for feature propagation. Experimental results on real-world academic citation networks and quantum chemistry graphs show that our approach achieves or matches state-of-the-art performance.
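The node-edge role switch rests on the line graph: each edge of the original undirected graph becomes a node, and two such nodes are adjacent when the original edges share an endpoint. The following Python sketch shows only that construction. The function name `line_graph` and the edge-list input format are assumptions for illustration, and the sketch does not implement CensNet's convolution operations.

```python
from itertools import combinations

def line_graph(edges):
    """Adjacency of the line graph of an undirected graph given as an edge list.

    Each undirected edge (u, v) becomes a node; two such nodes are adjacent
    when the original edges share an endpoint.
    """
    # Normalize edge orientation and drop duplicate edges.
    edges = sorted({tuple(sorted(e)) for e in edges})
    adjacency = {e: set() for e in edges}
    for e1, e2 in combinations(edges, 2):
        if set(e1) & set(e2):  # the two edges share at least one endpoint
            adjacency[e1].add(e2)
            adjacency[e2].add(e1)
    return adjacency
```

For a path with edges `(1, 2)`, `(2, 3)`, `(3, 4)`, the middle edge `(2, 3)` is adjacent to both of the others in the line graph, while the two end edges are not adjacent to each other.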