The objective of the study was to compare statistical methods for the analysis of binary repeated measures data with an additional hierarchical level. Random effects true models with autocorrelated ($\rho=1$, 0.9 or 0.5) subject random effects were used in this simulation study. The settings of the simulation were chosen to reflect a real veterinary somatic cell count dataset, except that the within--subject time series were balanced, complete and of fixed length (4 or 8 time points). Four fixed effects parameters were studied: binary predictors at the subject and cluster levels, respectively, a linear time effect, and the intercept. The following marginal and random effects statistical procedures were considered: ordinary logistic regression (OLR), alternating logistic regression (ALR), generalized estimating equations (GEE), marginal quasi-likelihood (MQL), penalized quasi-likelihood (PQL), pseudo likelihood (REPL), maximum likelihood (ML)
estimation and Bayesian Markov chain Monte Carlo (MCMC). The performance of these estimation procedures was compared specifically for the four fixed parameters as well as variance and correlation parameters. The findings of this study indicate that in data generated by random intercept models ($\rho=1$), the ML and MCMC procedures performed well and had fairly similar estimation errors. The PQL regression estimates were attenuated while the variance estimates were less accurate than ML and MCMC, but the direction of the bias depended on whether binomial or extra-binomial dispersion was assumed. In datasets with autocorrelation ($\rho<1$), random effects estimates procedures gave downwards biased estimates, while marginal estimates were little affected by the presence of autocorrelation. The results also indicate that in addition to ALR, a GEE procedure that accounts for clustering at the highest hierarchical level is sufficient.