Sequential data modeling and analysis have become indispensable for analyzing sequential data such as time series, because ever larger amounts of sensed event data are now available. These methods capture the sequential structure of the data of interest, such as input-output relationships and correlations among datasets. However, since most studies in this area are specialized for, or limited to, their respective applications, the requirements on such a model have not been rigorously analyzed from a general point of view. Hence, we examine the structure of sequential data and identify the "state duration" and the "state interval" of events as necessary for an efficient and rich representation of sequential data. Focusing on the hidden semi-Markov model (HSMM), which represents state duration inside the model, we add to it the capability to represent the state interval of events. To this end, we propose two extended models. The first is the interval state hidden semi-Markov model (IS-HSMM), which expresses the length of a state interval with a special state node designated the "interval state node." The second is the interval length probability hidden semi-Markov model (ILP-HSMM), which represents the length of a state interval with a new probabilistic parameter, the "interval length probability." Exhaustive simulations show that the proposed models outperform the HSMM. To the best of our knowledge, the proposed models are the first extensions of HMM to support state interval representation as well as state duration representation.
Analysis of sequential event data has been recognized as one of the essential tools in the field of data modeling and analysis. In this paper, after examining the technical requirements and issues involved in modeling complex but practical situations, we propose a new sequential data model, dubbed the Duration and Interval Hidden Markov Model (DI-HMM), that efficiently represents the "state duration" and the "state interval" of data events. This capability plays an important role in representing practical time-series sequential data and, in turn, enables efficient and flexible sequential data retrieval. Numerical experiments on synthetic and real data demonstrate the efficiency and accuracy of the proposed DI-HMM.
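To make the two quantities concrete, the following minimal Python sketch extracts them from a toy event sequence. It rests on assumptions that are not taken from the abstracts: each event is a hypothetical (label, start, end) record, "state duration" is read as end minus start, and "state interval" is read as the gap between the end of one event and the start of the next. It illustrates the terminology only; it does not implement IS-HSMM, ILP-HSMM, or DI-HMM.

```python
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    """Hypothetical event record (an assumption, not the papers' format)."""
    label: str
    start: float
    end: float


def durations_and_intervals(
    events: List[Event],
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, str, float]]]:
    """Return per-event durations and the gaps between consecutive events."""
    ordered = sorted(events, key=lambda e: e.start)
    # Assumed reading of "state duration": how long each event lasts.
    durations = [(e.label, e.end - e.start) for e in ordered]
    # Assumed reading of "state interval": gap from one event's end
    # to the next event's start.
    intervals = [
        (a.label, b.label, b.start - a.end)
        for a, b in zip(ordered, ordered[1:])
    ]
    return durations, intervals


events = [Event("A", 0.0, 2.0), Event("B", 3.0, 4.0), Event("C", 4.0, 7.0)]
durations, intervals = durations_and_intervals(events)
print(durations)
print(intervals)
```

In this toy sequence, events A and B are separated by a gap while B and C are adjacent, which is the kind of distinction a model of durations alone would not capture.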