We introduce the R package PGEE, which implements the penalized generalized estimating equations (GEE) procedure proposed by Wang et al. (2012) for analyzing longitudinal data with a large number of covariates. The PGEE package includes three main functions: CVfit, PGEE, and MGEE. The CVfit function computes the cross-validated tuning parameter for penalized generalized estimating equations. The function PGEE performs simultaneous estimation and variable selection for longitudinal data with high-dimensional covariates, whereas the function MGEE fits unpenalized GEE to the data for comparison. The R package PGEE is illustrated using a yeast cell-cycle gene expression data set.
In this study, general circulation model (GCM) simulations were statistically downscaled to monthly inflows of Kemer Dam in Turkey under the A1B, A2, and B1 emission scenarios using machine learning methods, multi-model ensemble, and bias correction approaches. Principal component analysis (PCA) was used to reduce the dimension of the potential predictors in the National Centers for Environmental Prediction and National Center for Atmospheric Research (NCEP/NCAR) reanalysis data. Suitable GCMs were then selected by investigating the rank correlations between the selected predictors in the NCEP/NCAR reanalysis data and those in the GCMs for the 20C3M scenario over the period 1979 to 1999. After training feedforward neural network (FFNN), least squares support vector machine (LSSVM), and relevance vector machine (RVM) downscaling models, the general performance of the downscaled predictions using NCEP/NCAR reanalysis data for the Kemer watershed showed that the trained RVM model produced adequate results. The effectiveness of the RVM model was illustrated by applying it to the 20C3M scenario for the period 1979 to 1999 and to the A1B, A2, and B1 future climate scenarios for the period 2010 to 2039. Afterwards, flow forecasts were obtained by building a multi-model ensemble from the selected GCMs, followed by a bias correction approach. Finally, the significance of the probable changes in trends was identified through statistical tests based on the corrected forecasts. Results indicate decreasing flow trends in the winter, spring, and fall seasons over the study area for the period 2010 to 2039.
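The rank-correlation screening of GCM predictors against reanalysis predictors can be sketched as follows. This is a minimal illustration that assumes Spearman's coefficient (the abstract does not say which rank correlation was used); the function names and the short series in the usage note are hypothetical and are not taken from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rank correlation is undefined for a constant series")
    return cov / (vx * vy) ** 0.5
```

A call such as `spearman(reanalysis_series, gcm_series)` on two equally long monthly predictor series (both names hypothetical) returns a value in [-1, 1]; under this sketch, a GCM whose predictors correlate weakly with the reanalysis predictors would be a candidate for exclusion.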
This thesis considers the analysis of bivariate longitudinal binary data. We propose a model based on the marginalized multilevel model framework. The proposed model consists of two levels: the first level associates the marginal mean of the responses with covariates through a logistic regression model, and the second level includes subject/time-specific random intercepts within a probit regression model. The covariance matrix of the multiple correlated time-specific random intercepts for each subject is assumed to represent the within-subject association. The subject-specific random effects covariance matrix is further decomposed into its dependence and variance components through the modified Cholesky decomposition method to handle possible computational and statistical problems that may be associated with its high dimensionality. The unconstrained versions of the resulting parameters are then modelled in terms of covariates with low-dimensional regression parameters, which provides better explanations of the dependence and variance parameters and reduces the number of parameters to be estimated in the random effects covariance matrix, avoiding possible identifiability problems. Marginal correlations between the responses of subjects and within the responses of a subject are derived through a Taylor series-based approximation. The data cloning computational algorithm is used to compute the maximum likelihood estimates of the parameters in the proposed model and their standard errors. The validity of the proposed model is assessed through a Monte Carlo simulation study under different scenarios, and the results are observed to be at an acceptable level. Lastly, the proposed model is illustrated through the Mother's Stress and Children's Morbidity study data, where both population-averaged and subject-specific interpretations are drawn through Empirical Bayes estimation of the random effects.
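The split of a covariance matrix into dependence and variance components can be sketched as below. This is a minimal illustration, assuming the usual convention for the modified Cholesky decomposition, T Σ Tᵀ = D, with T unit lower triangular and D diagonal; the thesis may parameterize it differently, and the function name and example matrix are hypothetical.

```python
def modified_cholesky(sigma):
    """Return (T, d) such that T * sigma * T^T = diag(d).

    T is unit lower triangular (dependence component) and d holds the
    positive diagonal entries (variance component).
    """
    n = len(sigma)
    # Step 1: factor sigma = L * diag(d) * L^T with unit lower-triangular L.
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = sigma[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        if d[j] <= 0:
            raise ValueError("matrix is not positive definite")
        for i in range(j + 1, n):
            partial = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (sigma[i][j] - partial) / d[j]
    # Step 2: T = L^{-1}, computed row by row by forward substitution.
    T = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i):
            T[i][j] = -sum(L[i][k] * T[k][j] for k in range(j, i))
    return T, d


# Hypothetical 3 x 3 covariance matrix, used only for illustration.
sigma = [[4.0, 2.0, 1.0],
         [2.0, 3.0, 1.5],
         [1.0, 1.5, 2.0]]
T, d = modified_cholesky(sigma)
```

Under this convention, the below-diagonal entries of T are unrestricted real numbers and the logarithms of the entries of d are unrestricted as well, which is what makes it possible to regress them on covariates without constraints, in the spirit of the modelling step described above.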