LASSO is a popular statistical tool, often used in conjunction with generalized linear models, that can simultaneously select variables and estimate parameters. When there are many variables of interest, as in current biological and biomedical studies, the power of LASSO can be limited. Fortunately, a large amount of biological and biomedical data has already been collected and may contain useful information about the importance of certain variables. This paper proposes an extension of LASSO, namely, prior LASSO (pLASSO), to incorporate that prior information into penalized generalized linear models. The goal is achieved by adding to the LASSO criterion function a measure of the discrepancy between the prior information and the model. For linear regression, the whole solution path of the pLASSO estimator can be found with a procedure similar to Least Angle Regression (LARS). Asymptotic theory and simulation results show that pLASSO provides significant improvement over LASSO when the prior information is relatively accurate. When the prior information is less reliable, pLASSO remains robust to the misspecification. We illustrate the application of pLASSO using a real data set from a genome-wide association study.
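The idea of adding a prior-discrepancy term to the LASSO criterion can be sketched for linear regression. The sketch below is an illustration under stated assumptions, not the paper's implementation: it assumes the prior information is summarized as a vector of prior predictions `y_prior`, that the discrepancy is a squared error weighted by a tuning parameter `eta`, and it uses plain coordinate descent rather than the LARS-like path procedure mentioned in the abstract. The names `lasso_cd`, `prior_lasso_linear`, `y_prior`, and `eta` are hypothetical. Under these assumptions the criterion reduces to an ordinary LASSO on a blended response.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by LASSO coordinate descent."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid = resid + X[:, j] * beta[j]   # partial residual without feature j
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid = resid - X[:, j] * beta[j]
    return beta


def prior_lasso_linear(X, y, y_prior, lam, eta):
    """Assumed criterion (illustrative only):
        (1/2n)||y - X b||^2 + (eta/2n)||y_prior - X b||^2 + lam * ||b||_1.
    Completing the square shows this equals, up to a constant and a factor
    (1 + eta), a LASSO on the blended response with penalty lam / (1 + eta).
    """
    y_mix = (y + eta * y_prior) / (1.0 + eta)
    return lasso_cd(X, y_mix, lam / (1.0 + eta))
```

With `eta = 0` the sketch reduces to the ordinary LASSO fit; larger `eta` pulls the fit toward the prior predictions, which is the trade-off the abstract describes between accurate and unreliable prior information.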
Objective: To determine whether self-reported menopausal symptoms are associated with measures of subclinical atherosclerosis. Setting: Multi-center, randomized controlled trial. Patients: Recently menopausal women (n=868) screened for the Kronos Early Estrogen Prevention Study (KEEPS). Design: Cross-sectional analysis. Interventions: None. Main Outcome Measures: Baseline menopausal symptoms (hot flashes, dyspareunia, vaginal dryness, night sweats, palpitations, mood swings, depression, insomnia, irritability), serum estradiol (E2) levels, and measures of atherosclerosis were assessed. Atherosclerosis was quantified using Coronary Artery Calcium (CAC) Agatston scores (n=771) and Carotid Intima-Media Thickness (CIMT). A logistic regression model of menopausal symptoms and E2 was used to predict CAC. A linear regression model of menopausal symptoms and E2 was used to predict CIMT. Correlations of length of time in menopause with menopausal symptoms, E2, CAC, and CIMT were assessed. Results: In early menopausal women screened for KEEPS, neither E2 nor climacteric symptoms predicted the extent of subclinical atherosclerosis. Palpitations (p=0.09) and depression (p=0.07) approached significance as predictors of CAC. The other symptoms (insomnia, irritability, dyspareunia, hot flashes, mood swings, night sweats, and vaginal dryness) were not associated with CAC. Women with significantly elevated CAC scores were excluded from further participation in KEEPS; in women meeting inclusion criteria, neither baseline menopausal symptoms nor E2 predicted CIMT. Years since menopause onset correlated with CIMT, dyspareunia, vaginal dryness, and E2. Conclusions: Self-reported symptoms in recently menopausal women are not strong predictors of subclinical atherosclerosis. Continued follow-up of this population will be performed to determine whether baseline or persistent symptoms in early menopause are associated with progression of cardiovascular disease.
Summary. The 'expectation-conditional maximization either' (ECME) algorithm has proven to be an effective way of accelerating the expectation-maximization algorithm for many problems. Recognizing the limitation of using prefixed acceleration subspaces in the ECME algorithm, we propose a dynamic ECME (DECME) algorithm which allows the acceleration subspaces to be chosen dynamically. The simplest DECME implementation is what we call DECME-1, which uses the line that is determined by the two most recent estimates as the acceleration subspace. The investigation of DECME-1 leads to an efficient, simple, stable and widely applicable DECME implementation, which uses two-dimensional acceleration subspaces and is referred to as DECME-2. The fast convergence of DECME-2 is established by the theoretical result that, in a small neighbourhood of the maximum likelihood estimate, it is equivalent to a conjugate direction method. The remarkable accelerating effect of DECME-2 and its variant is also demonstrated with several numerical examples.
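The line-based acceleration described for DECME-1 can be illustrated with a toy example. This is a rough sketch under assumptions, not the authors' algorithm: it takes "the two most recent estimates" to be the current iterate and its EM update, replaces an exact line search with a small grid of step lengths, and uses a hypothetical test problem (an equal-weight mixture of two unit-variance Gaussians with unknown means). The function names and the `alphas` grid are invented for illustration.

```python
import numpy as np


def loglik(theta, x):
    """Observed-data log-likelihood of 0.5*N(mu1, 1) + 0.5*N(mu2, 1)."""
    mu1, mu2 = theta
    a = -0.5 * (x - mu1) ** 2
    b = -0.5 * (x - mu2) ** 2
    const = np.log(0.5) - 0.5 * np.log(2.0 * np.pi)
    return float(np.sum(np.logaddexp(a, b) + const))


def em_step(theta, x):
    """One ordinary EM update of the two component means."""
    mu1, mu2 = theta
    a = -0.5 * (x - mu1) ** 2
    b = -0.5 * (x - mu2) ** 2
    r = np.exp(a - np.logaddexp(a, b))  # responsibility of component 1
    return np.array([np.sum(r * x) / np.sum(r),
                     np.sum((1.0 - r) * x) / np.sum(1.0 - r)])


def line_accelerated_step(theta, x, alphas=(1.5, 2.0, 3.0, 5.0, 8.0)):
    """EM step followed by a crude search along the line through
    theta and its EM update (assumed reading of the DECME-1 idea)."""
    theta_em = em_step(theta, x)
    d = theta_em - theta
    # The plain EM update is always a candidate, so the chosen point is
    # never worse than EM in terms of the observed-data log-likelihood.
    candidates = [theta_em] + [theta + a * d for a in alphas]
    return max(candidates, key=lambda t: loglik(t, x))
```

Because the search maximizes the actual likelihood and always includes the plain EM update, each accelerated step does at least as well as EM from the same starting point; how much faster it converges depends on the problem.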
Genetic markers with rare variants are spread out in the genome, making it necessary and difficult to consider them in genetic association studies. Consequently, wisely combining rare variants into “composite” markers may facilitate meaningful analyses. In this paper, we propose a novel approach of analyzing rare variant data by incorporating the least absolute shrinkage and selection operator technique. We applied this method to the Genetic Analysis Workshop 17 data, and our results suggest that this new approach is promising. In addition, we took advantage of having 200 phenotype replications and assessed the performance of our approach by means of repeated classification tree analyses. Our method and analyses were performed without knowledge of the underlying simulating model. Our method identified 38 markers (in 65 genes) that are significantly associated with the phenotype Affected and correctly identified two causal genes, SIRT1 and PDGFD.
Existing methods for analyzing rare variant data focus on collapsing a group of rare variants into a single common variant; collapsing is based on an intuitive function of the rare variant genotype information, such as an indicator function or a weighted sum. It is more natural, however, to take into account the single-nucleotide polymorphism (SNP) interactions informed directly by the data. We propose a novel tree-based method that automatically detects SNP interactions and generates candidate markers from the original pool of rare variants. In addition, we take advantage of the 200 phenotype replications in the Genetic Analysis Workshop 17 data to assess the candidate markers by means of repeated logistic regressions. This new approach shows potential for rare variant analysis. We correctly identify the association between gene FLT1 and phenotype Affected, although our results also contain false positives. Our analyses are performed without knowledge of the underlying simulating model.
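The two collapsing functions mentioned above, an indicator and a weighted sum, can be sketched as follows. This illustrates the existing collapsing approach that the abstract contrasts with, not the proposed tree-based method. It assumes a genotype matrix `G` with one row per subject and one column per rare variant, coded as minor-allele counts (0, 1, 2); the function names and the weights are hypothetical.

```python
import numpy as np


def collapse_indicator(G):
    """1 if a subject carries at least one minor allele in the group, else 0."""
    G = np.asarray(G)
    return (G.sum(axis=1) > 0).astype(int)


def collapse_weighted_sum(G, weights):
    """Weighted count of minor alleles across the variants in the group."""
    return np.asarray(G) @ np.asarray(weights)


# Three subjects, three rare variants (assumed 0/1/2 coding).
G = np.array([[0, 0, 0],
              [1, 0, 0],
              [0, 2, 1]])
carrier = collapse_indicator(G)
burden = collapse_weighted_sum(G, weights=[1.0, 2.0, 3.0])
```

Either collapsed vector can then be used as a single "common" marker in a standard association test; neither uses interactions among the variants, which is the limitation the abstract points to.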