When assessing surrogate endpoints in clinical studies under a causal-inference framework, a simulation-based sensitivity analysis is required to sample the unidentifiable parameters across plausible values. More precisely, correlation matrices must be sampled when only some of their entries are identified from the data, a task known as the matrix completion problem. The positive-definiteness constraints are cumbersome functions involving all matrix entries, making this a
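The completion task described above can be illustrated with a naive rejection sampler. This is only a minimal sketch, not the method the abstract refers to: it assumes the unidentified entries are drawn uniformly from (-1, 1), and the names `is_positive_definite`, `sample_completion`, and the entries in `known` are hypothetical.

```python
import math
import random


def is_positive_definite(m):
    """Return True if the symmetric matrix m admits a Cholesky factorisation."""
    n = len(m)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                d = m[i][i] - s
                if d <= 1e-12:  # non-positive pivot: not positive definite
                    return False
                chol[i][j] = math.sqrt(d)
            else:
                chol[i][j] = (m[i][j] - s) / chol[j][j]
    return True


def sample_completion(known, dim, rng, max_tries=10000):
    """Draw one positive-definite completion by rejection sampling.

    `known` maps (i, j) with i < j to an identified correlation. Every other
    off-diagonal entry is drawn uniformly from (-1, 1), which is an assumption
    of this sketch.
    """
    for _ in range(max_tries):
        m = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        for i in range(dim):
            for j in range(i + 1, dim):
                value = known.get((i, j))
                if value is None:
                    value = rng.uniform(-1.0, 1.0)
                m[i][j] = m[j][i] = value
        if is_positive_definite(m):
            return m
    raise RuntimeError("no positive-definite completion found")


rng = random.Random(2024)
known = {(0, 1): 0.6, (2, 3): 0.7}  # hypothetical identified entries
completed = sample_completion(known, 4, rng)
```

The acceptance rate of such a sampler drops quickly as the dimension grows, because the constraint couples all entries. That coupling is the difficulty the abstract points to.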
Estimating complex linear mixed models using an iterative full maximum likelihood estimator can be cumbersome in some cases. With small and unbalanced datasets, convergence problems are common. Also, for large datasets, iterative procedures can be computationally prohibitive. To overcome these computational issues, an unbiased two-stage closed-form estimator for the multivariate linear mixed model is proposed. It is rooted in pseudo-likelihood-based split-sample methodology, and useful, for example, when evaluating normally distributed endpoints in a meta-analytic context. However, applications go well beyond this framework. Its statistical and computational performance is assessed via simulation. The method is applied to a study in schizophrenia.
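The split-sample idea can be sketched for the simplest case, a univariate random-intercept model. The sketch assumes that clusters are split by size, that balanced one-way ANOVA formulas serve as the closed-form first stage, and that the second stage weights each subsample by its number of clusters. These choices and the function names are illustrative assumptions, not the estimator proposed in the abstract.

```python
from collections import defaultdict
from statistics import mean


def balanced_anova(clusters):
    """Closed-form ANOVA estimates for k >= 2 clusters of equal size n >= 2.

    Returns (mean, between-cluster variance, residual variance). The
    between-cluster estimate can be negative in small samples.
    """
    k, n = len(clusters), len(clusters[0])
    cluster_means = [mean(c) for c in clusters]
    mu = mean(cluster_means)
    within_ss = sum((y - m) ** 2 for c, m in zip(clusters, cluster_means) for y in c)
    between_ss = n * sum((m - mu) ** 2 for m in cluster_means)
    var_e = within_ss / (k * (n - 1))
    var_b = (between_ss / (k - 1) - var_e) / n
    return mu, var_b, var_e


def split_sample_estimate(clusters):
    """Stage 1: estimate within each cluster-size group. Stage 2: average."""
    by_size = defaultdict(list)
    for c in clusters:
        by_size[len(c)].append(c)
    totals, weight_sum = [0.0, 0.0, 0.0], 0
    for size, group in by_size.items():
        if size < 2 or len(group) < 2:
            continue  # group too small for the closed-form formulas
        estimates = balanced_anova(group)
        weight = len(group)  # assumed weighting: number of clusters
        totals = [t + weight * e for t, e in zip(totals, estimates)]
        weight_sum += weight
    if weight_sum == 0:
        raise ValueError("no usable subsample")
    return tuple(t / weight_sum for t in totals)


# Toy unbalanced data: two clusters of size 2 and two of size 3.
toy = [[1, 3], [5, 7], [2, 2, 2], [4, 4, 4]]
mu_hat, var_b_hat, var_e_hat = split_sample_estimate(toy)
```

No iteration is involved: each subsample is handled by explicit formulas and the results are combined by a weighted mean, which is what makes this kind of estimator cheap for large datasets.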
Clustered count data are commonly analysed by the generalized linear mixed model (GLMM). Here, the correlation due to clustering and some overdispersion is captured by the inclusion of cluster-specific normally distributed random effects. In some cases, the model does not capture the variability completely. Therefore, the GLMM can be extended by including a set of gamma random effects. Routinely, the GLMM is fitted by maximising the marginal likelihood. However, the whole maximisation process is computationally intensive. Although feasible with medium to large data, it can be too time-consuming or computationally intractable with very large data (overall sample and/or cluster size). Therefore, a less computationally intensive two-stage estimator for correlated, overdispersed count data is proposed. It is rooted in the pseudo-likelihood split-sample methodology. Based on a simulation study, it shows good statistical properties. Furthermore, it is computationally much faster than the full maximum likelihood estimator. The approach is illustrated using a large dataset from a network of Belgian general practices.
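To see how normal and gamma random effects together produce correlated, overdispersed counts, the data-generating mechanism can be simulated. The sketch below assumes a log link with only an intercept, a cluster-level normal effect, and an observation-level gamma effect with mean 1 that multiplies the Poisson rate. All parameter values and function names are hypothetical.

```python
import math
import random
from statistics import mean, pvariance


def poisson_draw(rate, rng):
    """Draw a Poisson count by Knuth's multiplication method (small rates)."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(n_clusters, cluster_size, intercept, sd_normal, gamma_shape, rng):
    """Simulate clustered counts with normal and gamma random effects."""
    data = []
    for _ in range(n_clusters):
        b = rng.gauss(0.0, sd_normal)  # cluster-specific normal effect
        cluster = []
        for _ in range(cluster_size):
            # Gamma effect with mean 1 (shape a, scale 1/a): extra overdispersion.
            theta = rng.gammavariate(gamma_shape, 1.0 / gamma_shape)
            cluster.append(poisson_draw(theta * math.exp(intercept + b), rng))
        data.append(cluster)
    return data


rng = random.Random(7)
clusters = simulate_counts(200, 10, 1.0, 0.5, 2.0, rng)
counts = [y for c in clusters for y in c]
dispersion = pvariance(counts) / mean(counts)  # equals 1 in expectation for pure Poisson
```

A variance-to-mean ratio well above 1 signals overdispersion relative to a plain Poisson model.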
Air pollution by carbon monoxide (CO) is one of the main factors affecting air quality in large cities, since it is directly related to urban activities. Both the mean level and the variability of CO concentrations change constantly over the course of a day, mainly because of local vehicular traffic. The aim of this work is to propose a nonparametric smoothing model for the hourly concentration of CO in the air, allowing for non-constant variance, that describes its behaviour over the course of a day. For this purpose, CO pollution records from a monitoring station located in the centre of the city of Cali, Colombia, were used. The curves were estimated by local linear regression, and the variance function by means of a variance function estimator. The estimated curves made it possible to describe the behaviour of CO, showing higher concentrations at peak hours and lower ones in the early morning; in addition, estimating a variance function allowed the heteroscedastic behaviour of the data to be modelled more adequately.
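The two smoothing steps can be sketched as follows. This is a minimal illustration, assuming a Gaussian kernel and a residual-based variance estimator (squared residuals smoothed by a second local linear fit). The abstract does not specify the kernel, the bandwidths, or the variance estimator, so those choices and the function names are assumptions.

```python
import math


def local_linear(x0, xs, ys, bandwidth):
    """Local linear regression estimate at x0 with Gaussian kernel weights."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    # Intercept of the weighted least-squares line centred at x0.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)


def variance_function(x0, xs, ys, bw_mean, bw_var):
    """Residual-based variance estimate at x0 (an assumed estimator)."""
    fitted = [local_linear(x, xs, ys, bw_mean) for x in xs]
    squared_residuals = [(y - f) ** 2 for y, f in zip(ys, fitted)]
    # A local linear fit can dip below zero, so the estimate is truncated at 0.
    return max(0.0, local_linear(x0, xs, squared_residuals, bw_var))


# Synthetic, noise-free check: a straight line is reproduced exactly.
xs = [i / 10 for i in range(11)]
ys = [2.0 + 3.0 * x for x in xs]
mean_at_half = local_linear(0.5, xs, ys, 0.2)
var_at_half = variance_function(0.5, xs, ys, 0.2, 0.3)
```

With hourly data, `xs` would hold the hour of day and `ys` the recorded concentrations. A variance estimate that rises around peak hours would reflect the heteroscedasticity described above.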