A comparison of statistical methods for the analysis of binary repeated measures data with additional hierarchical structure

Masaoud, Elmabrok; Stryhn, Henrik

doi:10.47302/jsr.2020540101

Cited by 1 publication

(10 citation statements)

References 40 publications


“…Random effects and marginal estimation procedures were selected based on their performance in the full and balanced simulated datasets (Masaoud and Stryhn, 2020). Random effects estimation procedures included several approximation algorithms, aimed at producing estimates close to the global ML estimate without actually computing the likelihood function (Breslow, 2003).…”

Section: Estimation Procedures (mentioning)

confidence: 99%

“…Marginal estimation procedures included GEE, generalized estimating equations, and some of its variants; for more details, see (Masaoud and Stryhn, 2020). For missing data scenarios involving drop-outs by an MAR process, a weighted generalized estimating equation (WGEE) procedure was employed to account for the bias induced by the MAR mechanism.…”

Section: Estimation Procedures (mentioning)

confidence: 99%

“…A GEE procedure may allow an MAR process to be ignored if the working correlation structure is specified correctly (Liang and Zeger, 1986; Jansen et al, 2006); see however (Preisser et al, 2002) for examples where this does not hold. The GEE procedure was set up with either an independence or exchangeable working correlation structure at the cluster (herd) level; results from (Masaoud and Stryhn, 2020) showed that GEE with these correlations at the cluster level performed well for balanced repeated measures data with an additional hierarchical structure. The calculations involved in the weighting scheme have been detailed elsewhere (Jansen et al, 2006; Molenberghs and Verbeke, 2005, Chapter 27).…”

Section: Estimation Procedures (mentioning)

confidence: 99%

“…Thus, the hierarchical structure is the clustering of cows in herds, the repeated measures are the monthly test records based on the milk samples, and the missing values are the incomplete records on each cow. A previous study by (Masaoud and Stryhn, 2020) targeted the added complexity of the additional hierarchical structure in a balanced full datasets setting, whereas the present study is focused on the missing values part. Generally, missingness in longitudinal data presents a potential source of bias.…”

Section: Introduction (mentioning)

confidence: 99%

“…In order to realistically reflect the choice an applied researcher faces when it comes to data analysis, only estimation procedures implemented in broadly accessible statistical software were considered for the study. Specifically, the following procedures previously studied for hierarchically structured binary repeated measures data (Masaoud and Stryhn, 2020) were included: maximum likelihood via numerical integration (ML), Bayesian Markov chain Monte Carlo (MCMC), penalized quasi-likelihood with binomial dispersion (PQL) and an extra-binomial dispersion (PQLx), ordinary logistic regression (OLR), alternating logistic regression (ALR), and weighted generalized estimating equations (WGEE). The adapted ALR macro for 3-level of clustering (Kunthel et al, 2014) is recently available when estimation of the association structure is of primary interest, though was not included in the present simulation study.…”

Section: Introduction (mentioning)

confidence: 99%


A simulation study to assess the impact of missing values on the performance of different statistical methods for analysis of binary repeated measures data with an additional hierarchical structure

Masaoud, Stryhn. 2024. J. of Statis. Res.


The primary objective of the study was to assess the impact of missing values on the analysis of binary repeated measures data with an additional hierarchical structure. One motivating example for the present study was records of high somatic cell counts in milk samples obtained by approximately monthly sampling throughout the lactations of cows in dairy herds. The simulated data were generated from random effects models with autocorrelated (ρ = 1, 0.9 or 0.5) subject-level random effects. In general, the settings of the simulation were chosen to reflect a real somatic cell count dataset (scc40), except that the within-cow time series length was set to 8 time points for each cow. The estimation procedures considered were: Ordinary Logistic Regression (OLR), Alternating Logistic Regression (ALR), Weighted Generalized Estimating Equations (WGEE), Penalized Quasi-Likelihood (PQL), Maximum Likelihood via numerical integration (ML) and Bayesian Markov chain Monte Carlo (MCMC). Multiple scenarios of simulated incomplete datasets were considered. One scenario corresponded to a combination of the missingness patterns present in the scc40 dataset (scc40 scenario). The remaining scenarios involved only drop-outs, and corresponded to either moderate or high percentages of values either missing at random (MAR) or not missing at random (NMAR). In the scc40 scenario, all estimation procedures except OLR performed well and produced estimates with small relative bias (generally less than 5%) for levels of missingness that roughly corresponded to the scc40 data. In MAR missingness scenarios, some biases were found for the ALR, WGEE and PQL procedures, whereas the likelihood-based procedures were largely unaffected by the missing values. In NMAR scenarios, all procedures experienced similar and strong biases in the time coefficient; however, fixed effects estimates at the subject and cluster levels were relatively unaffected.
Journal of Statistical Research 2023, Vol 57, No.1-2, pp.35-67
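
The simulation design summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the coefficient values, the random effect standard deviations and the drop-out model (a logistic function of the previous observed outcome) are all assumptions chosen for illustration. Only the 8 time points, the cow-within-herd structure and the autocorrelation values ρ = 1, 0.9 or 0.5 come from the abstract.

```python
import numpy as np


def ar1_effects(rng, n_subjects, n_times, rho, sd):
    """Stationary AR(1) subject-level random effects with variance sd**2.

    With rho = 1 the innovations vanish, so each subject keeps one
    constant effect (an ordinary random intercept).
    """
    v = np.empty((n_subjects, n_times))
    v[:, 0] = rng.normal(0.0, sd, n_subjects)
    innov_sd = sd * np.sqrt(1.0 - rho**2)
    for t in range(1, n_times):
        v[:, t] = rho * v[:, t - 1] + rng.normal(0.0, innov_sd, n_subjects)
    return v


def simulate_binary(rng, n_herds, cows_per_herd, n_times, rho,
                    beta0=-1.0, beta_time=0.1, sd_herd=0.5, sd_cow=1.0):
    """Binary outcomes for cows nested in herds (all parameter values assumed)."""
    n_cows = n_herds * cows_per_herd
    herd = np.repeat(np.arange(n_herds), cows_per_herd)
    u = rng.normal(0.0, sd_herd, n_herds)[herd]          # herd random intercepts
    v = ar1_effects(rng, n_cows, n_times, rho, sd_cow)   # cow-level AR(1) effects
    time = np.arange(n_times)
    eta = beta0 + beta_time * time[None, :] + u[:, None] + v
    p = 1.0 / (1.0 + np.exp(-eta))
    y = rng.binomial(1, p).astype(float)
    return herd, y


def apply_mar_dropout(rng, y, alpha0=-2.5, alpha1=1.0):
    """Monotone drop-out whose probability depends on the previous outcome.

    The previous outcome is observed for every cow still in the study,
    so the mechanism is MAR. alpha0 and alpha1 are assumed values.
    """
    y_obs = y.copy()
    n_subjects, n_times = y.shape
    dropped = np.zeros(n_subjects, dtype=bool)
    for t in range(1, n_times):
        p_drop = 1.0 / (1.0 + np.exp(-(alpha0 + alpha1 * y[:, t - 1])))
        dropped |= ~dropped & (rng.random(n_subjects) < p_drop)
        y_obs[dropped, t] = np.nan   # once dropped, all later records are missing
    return y_obs


rng = np.random.default_rng(0)
herd, y = simulate_binary(rng, n_herds=10, cows_per_herd=5, n_times=8, rho=0.9)
y_obs = apply_mar_dropout(rng, y)
print("proportion missing:", np.isnan(y_obs).mean())
```

Letting the drop-out probability depend on the current, possibly unobserved, outcome instead of the previous one would turn this into an NMAR mechanism of the kind the abstract contrasts with MAR.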

