Jouni Helske scite author profile

Sequence analysis is being more and more widely used for the analysis of social sequences and other multivariate categorical time series data. However, it is often complex to describe, visualize, and compare large sequence data, especially when there are multiple parallel sequences per subject. Hidden (latent) Markov models (HMMs) are able to detect underlying latent structures and they can be used in various longitudinal settings: to account for measurement error, to detect unobservable states, or to compress information across several types of observations. Extending to mixture hidden Markov models (MHMMs) allows clustering data into homogeneous subsets, with or without external covariates. The seqHMM package in R is designed for the efficient modeling of sequences and other categorical time series data containing one or multiple subjects with one or multiple interdependent sequences using HMMs and MHMMs. Also other restricted variants of the MHMM can be fitted, e.g., latent class models, Markov models, mixture Markov models, or even ordinary multinomial regression models with suitable parameterization of the HMM. Good graphical presentations of data and models are useful during the whole analysis process from the first glimpse at the data to model fitting and presentation of results. The package provides easy options for plotting parallel sequence data, and proposes visualizing HMMs as directed graphs.
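
As a rough illustration of what fitting an HMM to a categorical sequence involves, the likelihood of one observed sequence can be computed with the scaled forward algorithm. The sketch below is plain Python and is not the seqHMM API; the function name `forward_log_likelihood` and the toy parameter values are assumptions made for this example only.

```python
import math


def forward_log_likelihood(obs, initial, transition, emission):
    """Log-likelihood of one categorical sequence under a discrete HMM.

    obs        : list of observed symbol indices
    initial    : initial[i] = P(first hidden state is i)
    transition : transition[i][j] = P(next state is j | current state is i)
    emission   : emission[i][k] = P(symbol k | hidden state i)
    """
    n_states = len(initial)
    # Joint probability of each hidden state and the first observation.
    alpha = [initial[i] * emission[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[i] * transition[i][j] for i in range(n_states))
                * emission[j][obs[t]]
                for j in range(n_states)
            ]
        # Rescale at every step to avoid numerical underflow on long sequences.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Toy two-state model with two observable symbols (illustrative values only).
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.2, 0.8]]
emission = [[0.9, 0.1], [0.3, 0.7]]
print(forward_log_likelihood([0, 1, 1, 0], initial, transition, emission))
```

A mixture HMM would, under the same assumptions, evaluate this quantity once per cluster-specific submodel and combine the results with the cluster weights.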

KFAS: Exponential Family State Space Models in R

Helske

2017

J. Stat. Soft.

State space modeling is an efficient and flexible method for statistical inference of a broad class of time series and other data. This paper describes the R package KFAS for state space modeling with the observations from an exponential family, namely Gaussian, Poisson, binomial, negative binomial and gamma distributions. After introducing the basic theory behind Gaussian and non-Gaussian state space models, an illustrative example of Poisson time series forecasting is provided. Finally, a comparison to alternative R packages suitable for non-Gaussian time series modeling is presented.
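
For the Gaussian case mentioned in the abstract, the simplest state space model is the local level model, $y_t = \mu_t + \varepsilon_t$ with $\mu_{t+1} = \mu_t + \eta_t$. A minimal Kalman filter for it might look like the sketch below. This is plain Python rather than the KFAS interface, and it assumes a proper Gaussian prior (`a0`, `p0`) for the initial level; the function name and argument names are hypothetical.

```python
import math


def local_level_filter(y, var_eps, var_eta, a0=0.0, p0=1e7):
    """Kalman filter for y_t = mu_t + eps_t, mu_{t+1} = mu_t + eta_t.

    var_eps : variance of the observation noise eps_t
    var_eta : variance of the level disturbance eta_t
    a0, p0  : assumed prior mean and variance of the initial level
    Returns (filtered means, filtered variances, Gaussian log-likelihood).
    """
    a, p = a0, p0  # one-step-ahead prediction of the level and its variance
    means, variances, log_lik = [], [], 0.0
    for obs in y:
        v = obs - a        # prediction error
        f = p + var_eps    # prediction error variance
        k = p / f          # Kalman gain
        a_filt = a + k * v
        p_filt = p * (1.0 - k)
        log_lik += -0.5 * (math.log(2.0 * math.pi * f) + v * v / f)
        means.append(a_filt)
        variances.append(p_filt)
        # Predict the next level: random walk, so the mean carries over.
        a, p = a_filt, p_filt + var_eta
    return means, variances, log_lik


means, variances, log_lik = local_level_filter(
    [4.4, 4.0, 3.5, 3.9, 4.6], var_eps=0.5, var_eta=0.1
)
print(means[-1], variances[-1], log_lik)
```

The non-Gaussian models the paper covers (Poisson, binomial, and so on) need approximation or simulation steps on top of a filter like this one, which the sketch does not attempt.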

Introducing libeemd: a program package for performing the ensemble empirical mode decomposition

2015

The ensemble empirical mode decomposition (EEMD) and its complete variant (CEEMDAN) are adaptive, noise-assisted data analysis methods that improve on the ordinary empirical mode decomposition (EMD). All these methods decompose possibly nonlinear and/or nonstationary time series data into a finite amount of components separated by instantaneous frequencies. This decomposition provides a powerful method to look into the different processes behind a given time series data, and provides a way to separate short time-scale events from a general trend. We present a free software implementation of EMD, EEMD and CEEMDAN and give an overview of the EMD methodology and the algorithms used in the decomposition. We release our implementation, libeemd, with the aim of providing a user-friendly, fast, stable, well-documented and easily extensible EEMD library for anyone interested in using (E)EMD in the analysis of time series data. While written in C for numerical efficiency, our implementation includes interfaces to the Python and R languages, and interfaces to other languages are straightforward.

Combining Sequence Analysis and Hidden Markov Models in the Analysis of Complex Life Sequence Data

Helske

Eerola

2018

Longitudinal data often consists of multiple parallel sequences that ought to be analyzed jointly. For example, life course data may contain sequences of employment, family formation, and residence. Such data is often referred to as multichannel or multidimensional sequence data. A multichannel approach often gives a simpler representation of the data as opposed to combining states across life domains (the extended alphabet approach); the latter approach rapidly grows the state space as the number of channels and/or states grows. If some data is only partially observed, the multichannel approach also allows for handling data as it is instead of having to make difficult decisions on how to combine observed and unobserved states (Helske and Helske 2018). Joint analysis of complex multidimensional data poses several challenges. Multichannel sequence analysis (Gauthier et al. 2010) has been the standard tool for the analysis of multichannel sequence data (for empirical applications see, e.g.,

Mixture Hidden Markov Models for Sequence Data: The seqHMM Package in R
