2020
DOI: 10.1002/sim.8513

Two‐part hidden Markov models for semicontinuous longitudinal data with nonignorable missing covariates

Abstract: This study develops a two-part hidden Markov model (HMM) for analyzing semicontinuous longitudinal data in the presence of missing covariates. The proposed model manages a semicontinuous variable by splitting it into two random variables: a binary indicator for determining the occurrence of excess zeros at all occasions and a continuous random variable for examining its actual level. For the continuous longitudinal response, an HMM is proposed to describe the relationship between the observation and unobservab…
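The two-part split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `split_two_part`, the NaN coding for zero occasions, and the toy data are all assumptions made for illustration, and the hidden Markov component and missing-covariate mechanism are not modeled here.

```python
import numpy as np

def split_two_part(y):
    """Split a semicontinuous array into its two parts (hypothetical helper).

    Returns
    -------
    is_zero : int array, 1 where the observation is exactly zero, else 0
    level   : float array, the positive value where y > 0, NaN where y == 0
    """
    y = np.asarray(y, dtype=float)
    if np.any(y < 0):
        raise ValueError("semicontinuous data are assumed to be nonnegative")
    is_zero = (y == 0).astype(int)       # binary part: occurrence of a zero
    level = np.where(y > 0, y, np.nan)   # continuous part: actual level
    return is_zero, level

# Toy longitudinal data: 2 subjects (rows) observed at 3 occasions (columns).
y = np.array([[0.0, 2.5, 0.0],
              [1.0, 0.0, 3.0]])
is_zero, level = split_two_part(y)
```

In a two-part analysis, `is_zero` would typically feed a binary model and the non-missing entries of `level` a model for positive values; which specific models apply is outside what this sketch assumes.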

Cited by 9 publications (6 citation statements)
References: 39 publications
“…In addition to censoring in responses, longitudinal data often exhibit other complexities such as dropouts and missing data. Formal methods to address dropouts and missing data include multiple imputations [5], weighted estimating equations [21], latent Markov models [22], Bayesian approaches [23], and likelihood‐based methods [24,25]. In particular, for likelihood‐based methods, it is conceptually straightforward to develop constrained tests, but the computation can become more challenging.…”
Section: Discussion (mentioning)
confidence: 99%
“…In addition to censoring in responses, longitudinal data often exhibit other complexities such as dropouts and missing data. Formal methods to address dropouts and missing data include multiple imputations [5], weighted estimating equations [21], latent Markov models [22], Bayesian approaches [23], and likelihood‐based methods [24,25]. In particular, for likelihood‐based methods, it is conceptually straightforward to develop constrained tests, but the computation can become more challenging.…”
Section: Discussion (mentioning)
confidence: 99%
“…We considered as benchmarks five supervised methods using the labeled set alone: (i) LASSO-penalized logistic regression [16,17,34,37–39], (ii) random forest (RF) [40,41], (iii) linear discriminant analysis (LDA) [42], and (iv) LSTM-gated recurrent neural network (RNN) [24,39,43,44] trained with raw feature counts C_{i,t}, as well as (v) LDA trained with patient-timepoint embeddings generated without weights, which we refer to as LDA embed. In addition, we considered a semi-supervised benchmark: hidden Markov model (HMM) [26–29,45,46] with a multivariate Gaussian emission trained with the weight-free embeddings. Only HMM and RNN leverage the longitudinal nature of the data, while all other comparator methods train models for predicting Y_t based only on concurrent features (C_{i,t} or ) without considering the time sequence.…”
Section: Methods (mentioning)
confidence: 99%
“…For instance, Jackson et al. apply a multistage discrete HMM to aneurysm screening, Sukkar et al. apply one to Alzheimer's disease, and Wang et al. apply a continuous HMM to progression of chronic obstructive pulmonary disease (COPD) [26–29]. While these unsupervised models produce promising computational models of disease progression, they may learn latent disease stages that are not clinically relevant.…”
Section: Introduction (mentioning)
confidence: 99%
“…Models for longitudinal semicontinuous data have, in particular, been receiving a lot of attention in two ways. The first approach is the two-part mixed model wherein a mixture of Bernoulli with positive support distribution is used to model zero and positive components separately (Olsen and Schafer [1]; Berk and Lachenbruch [2]; Tooze et al. [3]; Su et al. [4,5]; Liu et al. [6]; Zhou et al. [7]). However, Hasan et al. [8] and Yan and Ma [9] pointed out that such artificial separation based on the two-part modeling method breaks down the serial patterns in the analysis of time series and longitudinal data.…”
Section: Introduction (mentioning)
confidence: 99%
“…In addition, based on these frequentist approaches of handling nonignorable missing response or covariate data, their Bayesian analogues have been extended to various regression models. For example, from a Bayesian perspective, see Huang et al. [15] for generalized linear models with nonignorably missing covariates, Lee and Tang [16] for nonlinear structural equation models with nonignorable missing data, Tang and Zhao [17] for nonlinear reproductive dispersion mixed models for longitudinal data with nonignorable missing covariates, Tang et al. [18] for a nonlinear dynamic factor analysis model with nonparametric prior and possible nonignorable missingness, Zhou et al. [7] for two-part hidden Markov models for semicontinuous longitudinal data with nonignorable missing covariates, Wang and Tang [19] for Bayesian quantile regression with mixed discrete and nonignorable missing covariates, and Wang et al. [20] for Bayesian latent factor on image regression with nonignorable missing data. Therefore, we propose a fully Bayesian method by which to simultaneously estimate unknown parameters, random effects and nonparametric function in a Tweedie compound Poisson partial linear mixed models on the basis of Bayesian P-spline approximation to nonparametric function in the presence of nonignorable missing covariates and responses, where the nonignorable missing data mechanism is specified by a logistic regression model.…”
Section: Introduction (mentioning)
confidence: 99%