Sequence pattern discovery is a key issue in multivariate time series analysis. Popular approaches first obtain the patterns of each single-variate time series and then derive cross-variate associations. In this paper, we consider different variables simultaneously during pattern construction and propose a new type of pattern called the State Transition pAttern with Periodic wildcard gaps (STAP). Compared with previous pattern types, STAP reveals stronger cross associations among different variables and provides better interpretability for decision makers. We design a two-stage approach, consisting of frequent state discovery and pattern synthesis, to obtain frequent STAPs. We propose two pre-pruning techniques and one Apriori-pruning technique to speed up pattern discovery. We also propose two post-pruning techniques to simplify the output and a visualization technique to support expert decisions. Experimental results on four real-world datasets demonstrate that 1) STAP captures the cross and temporal associations; 2) the five pruning techniques and the pattern synthesis technique are quite effective; and 3) the visualization technique greatly increases the readability of STAPs.
INDEX TERMS: Cross-variate association, multivariate time series, sequence pattern.
Recently, multivariate time series (MTS) prediction has attracted much attention as a way to obtain richer semantics with similar or better performance. In this paper, we propose a tri-partition alphabet-based state (tri-state) prediction method for symbolic MTSs. First, for each variable, the set of all symbols, i.e., the alphabet, is divided into strong, medium, and weak subsets using two user-specified thresholds. With the tri-partitioned alphabet, the tri-state takes the form of a matrix. One dimension covers all the variables. The other is a feature vector that contains the most likely occurring strong, medium, and weak symbols. Second, a tri-partition strategy based on the deviation degree is proposed. We use piecewise aggregate approximation and symbolic aggregate approximation to aggregate and discretize the original MTS. Under this strategy, the larger a symbol's deviation, the stronger the symbol. Moreover, most popular numerical or symbolic similarity or distance metrics can be combined. Third, we propose an along–across similarity model to obtain the k-nearest matrix neighbors. This model considers the associations among the time stamps and among the variables simultaneously. Fourth, we design two post-filling strategies to obtain a completed tri-state. The experimental results on datasets from four domains show that (1) the tri-state has greater recall but lower precision; (2) the two post-filling strategies can slightly improve the recall; and (3) the along–across similarity model composed of the Triangle and Jaccard metrics is the first recommendation for new datasets.
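The discretization and tri-partition steps can be illustrated with a minimal sketch. It assumes the standard piecewise aggregate approximation (PAA) and symbolic aggregate approximation (SAX) with a five-symbol alphabet. The deviation degree used here (the normalized distance of a symbol from the middle of the alphabet) and the threshold names `alpha` and `beta` are assumptions made for illustration; the abstract does not give the paper's exact definitions.

```python
import numpy as np

# Standard SAX breakpoints for 5 symbols: N(0, 1) quantiles at 0.2, 0.4, 0.6, 0.8.
BREAKPOINTS = np.array([-0.84, -0.25, 0.25, 0.84])
ALPHABET = "abcde"


def paa(series, n_segments):
    """Z-normalize the series, then average it over n_segments pieces."""
    series = np.asarray(series, dtype=float)
    std = series.std()
    z = (series - series.mean()) / std if std > 0 else np.zeros_like(series)
    return np.array([seg.mean() for seg in np.array_split(z, n_segments)])


def sax(series, n_segments):
    """Map each PAA value to a symbol using the breakpoints."""
    idx = np.searchsorted(BREAKPOINTS, paa(series, n_segments))
    return "".join(ALPHABET[i] for i in idx)


def tri_partition(symbol, alpha=0.75, beta=0.25):
    """Label a symbol strong/medium/weak by an assumed deviation degree."""
    center = (len(ALPHABET) - 1) / 2
    # Assumed deviation degree: distance from the alphabet's center, in [0, 1].
    deviation = abs(ALPHABET.index(symbol) - center) / center
    if deviation >= alpha:
        return "strong"
    if deviation >= beta:
        return "medium"
    return "weak"


word = sax([1, 1, 5, 5, 9, 9], n_segments=3)
labels = [tri_partition(s) for s in word]
print(word, labels)
```

In this sketch the symbols farthest from the series mean fall into the strong subset and the central symbol into the weak subset, which matches the abstract's statement that stronger symbols have larger deviation.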