This paper is concerned with screening features in ultrahigh dimensional data analysis, which has become increasingly important in diverse scientific fields. We develop a sure independence screening procedure based on the distance correlation (DC-SIS, for short). The DC-SIS can be implemented as easily as the sure independence screening procedure based on the Pearson correlation (SIS, for short) proposed by Fan and Lv (2008). However, the DC-SIS can significantly improve on the SIS. Fan and Lv (2008) established the sure screening property for the SIS based on linear models, whereas the sure screening property holds for the DC-SIS under more general settings that include linear models. Furthermore, the implementation of the DC-SIS does not require model specification (e.g., linear model or generalized linear model) for responses or predictors. This is a very appealing property in ultrahigh dimensional data analysis. Moreover, the DC-SIS can be used directly to screen grouped predictor variables and to handle multivariate response variables. We establish the sure screening property for the DC-SIS, and conduct simulations to examine its finite sample performance. Numerical comparison indicates that the DC-SIS performs much better than the SIS in various models. We also illustrate the DC-SIS through a real data example.
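The screening idea described above can be illustrated with a minimal sketch. It assumes the usual sample (V-statistic) version of the squared distance correlation and ranks each predictor by that score against the response, keeping the top `d`. The function names `dcor_sq` and `dc_sis`, and the choice of `d`, are hypothetical and not taken from the abstract; this is an illustration under those assumptions, not the authors' implementation.

```python
import math

def _as_point(v):
    """Treat a scalar as a 1-D point so scalars and vectors share one code path."""
    return tuple(v) if isinstance(v, (list, tuple)) else (float(v),)

def _centered_distances(sample):
    """Double-centered matrix of pairwise Euclidean distances."""
    pts = [_as_point(v) for v in sample]
    n = len(pts)
    d = [[math.dist(p, q) for q in pts] for p in pts]
    row = [sum(r) / n for r in d]
    grand = sum(row) / n
    # d is symmetric, so column means equal row means.
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]

def _mean_product(a, b):
    """Average of the elementwise product of two n x n matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)

def dcor_sq(x, y):
    """Sample squared distance correlation between two samples of equal length."""
    a, b = _centered_distances(x), _centered_distances(y)
    denom = math.sqrt(_mean_product(a, a) * _mean_product(b, b))
    return _mean_product(a, b) / denom if denom > 0 else 0.0

def dc_sis(columns, y, d):
    """Rank predictors by squared distance correlation with y; keep the top d.

    columns: list of predictor samples (each a list of scalars, or of
             tuples for a grouped predictor); y: response sample.
    Returns (indices of retained predictors, all scores).
    """
    scores = [dcor_sq(col, y) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda k: scores[k], reverse=True)
    return ranked[:d], scores
```

Because the score depends only on pairwise distances, no regression model is fitted, and a predictor that enters nonlinearly (for example, through its square) can still be ranked highly. Grouped predictors or multivariate responses can be passed as tuples per observation.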
We provide a novel approach to dimension-reduction problems that differs completely from the existing literature. We cast the dimension-reduction problem in a semiparametric estimation framework and derive estimating equations. Viewing the problem from this new angle allows us to derive a rich class of estimators and to obtain the classical dimension-reduction techniques as special cases in this class. The semiparametric approach also reveals that, in the inverse regression context, the common assumption of linearity and/or constant variance on the covariates can be removed while keeping the estimation structure intact, at the cost of performing additional nonparametric regression. The semiparametric estimators without these common assumptions are illustrated through simulation studies and a real data example. This article has online supplementary material.
Complex environmental conditions can significantly affect bacterial genome size by unknown mechanisms. The So0157-2 strain of Sorangium cellulosum is an alkaline-adaptive epothilone producer that grows across a wide pH range. Here, we show that the genome of this strain is 14,782,125 base pairs, 1.75 megabases larger than the largest bacterial genome from S. cellulosum reported previously. The 11,599 total coding sequences (CDSs) include massive duplications and horizontally transferred genes, regulated by many protein kinases, sigma factors and related transcriptional regulation co-factors, providing the So0157-2 strain with abundant resources and flexibility for ecological adaptation. The comparative transcriptomics approach, which detected 90.7% of the total CDSs, not only demonstrates complex expression patterns under varying environmental conditions but also suggests an alkaline-improved pathway of insertion and duplication in this strain, which has been genetically verified. These results provide insights into and a paradigm for how environmental conditions can affect bacterial genome expansion.
IGF-I, a ubiquitous polypeptide, plays a key role in longitudinal bone growth and acquisition. The most predominant effect of skeletal IGF-I is acceleration of the differentiation program for osteoblasts. However, in vivo studies using recombinant human (rh) IGF-I and/or rhGH have demonstrated stimulation of both bone formation and resorption, thereby potentially limiting the usefulness of these peptides in the treatment of osteoporosis. In this study, we hypothesized that IGF-I modulates bone resorption by regulating expression of osteoprotegerin (OPG) and receptor activator of nuclear factor-κB (RANK) ligand (RANKL) in bone cells. Using Northern analysis in ST2 cells, we found that human IGF-I suppressed OPG mRNA in a time- and dose-dependent manner: 100 µg/liter IGF-I (13 nM) decreased OPG expression by 37.0 ± 1.8% (P < 0.002). The half maximal inhibitory dose of IGF-I was reached at 50 µg/liter (approximately 6.5 nM) with no effect of IGF-I on OPG message stability. Conditioned media from ST2 cells confirmed that IGF-I decreased secreted OPG, reducing levels by 42%, from 12.1 to 7 ng/ml at 48 h (P < 0.05). Similarly, IGF-I at 100 µg/liter (13 nM) increased RANKL mRNA expression to 353 ± 74% above untreated cells as assessed by real-time PCR. In vivo, low doses of rhGH when administered to elderly postmenopausal women only modestly raised serum IGF-I (to concentrations of 18-26 nM) and did not affect circulating OPG concentrations; however, administration of rhIGF-I (30 µg/kg·d) for 1 yr to older women resulted in a significant increase in serum IGF-I (to concentrations of 39-45 nM) and a 20% reduction in serum OPG (P < 0.05). In summary, we conclude that IGF-I in a dose- and time-dependent manner regulates OPG and RANKL in vitro and in vivo.
These data suggest IGF-I may act as a coupling factor in bone remodeling by activating both bone formation and bone resorption; the latter effect appears to be mediated through the OPG/RANKL system in bone.
Summarizing the effect of many covariates through a few linear combinations is an effective way of reducing covariate dimension and is the backbone of (sufficient) dimension reduction. Because the replacement of high-dimensional covariates by low-dimensional linear combinations is performed with minimal assumptions on the specific regression form, it enjoys attractive advantages but also encounters unique challenges in comparison with the variable selection approach. We review the current literature on dimension reduction with an emphasis on the two most popular models, where the dimension reduction affects the conditional distribution and the conditional mean, respectively. We discuss various estimation and inference procedures at different levels of detail, with the intention of focusing on their underlying ideas instead of technicalities. We also discuss some unsolved problems in this area for potential future research.