Abstract. Traditional clustering methods assume that there is no measurement error, or uncertainty, associated with data. Often, however, real-world applications require treatment of data that have such errors. In the presence of measurement errors, well-known clustering methods like k-means and hierarchical clustering may not produce satisfactory results. The fundamental question addressed in this paper is: "What is an appropriate clustering method in the presence of errors associated with data?" In the first part of this paper, we develop a statistical model and algorithms for clustering data in the presence of errors. We assume that the errors associated with data follow a multivariate Gaussian distribution and are independent between data points. The model uses the maximum likelihood principle and provides us with a new metric for clustering. This metric is used to develop two algorithms for error-based clustering, hError and kError, which are generalizations of Ward's hierarchical and k-means clustering algorithms, respectively. In the second part of the paper, we discuss classes of clustering problems where error information associated with the data to be clustered is readily available and where error-based clustering is likely to be superior to clustering methods that ignore error. We give examples of the effectiveness of error-based clustering on data generated from the following statistical models: (1) sample averaging, (2) multiple linear regression, (3) ARIMA time series, and (4) Markov chain models. We present theoretical and empirical justifications for the value of error-based clustering on these classes of problems.
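To make the error-based idea concrete, the sketch below gives one plausible reading of a kError-style iteration in Python: each observation carries its own Gaussian error covariance, assignment uses a Mahalanobis-type distance weighted by that observation's error precision, and each cluster center is re-estimated as the inverse-covariance-weighted mean of its members (the maximum-likelihood estimate of a shared mean under independent Gaussian errors). The function name kerror_sketch and all implementation details are illustrative assumptions, not the paper's algorithm or code.

```python
import numpy as np

def kerror_sketch(X, covs, k, n_iter=50, seed=0):
    """Illustrative k-means-style clustering with per-point error covariances.

    X    : (n, d) array of observations
    covs : (n, d, d) array of error covariance matrices, one per observation
    k    : number of clusters
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    inv_covs = np.linalg.inv(covs)                     # per-point precision matrices
    centers = X[rng.choice(n, size=k, replace=False)].astype(float)
    labels = np.zeros(n, dtype=int)

    for _ in range(n_iter):
        # Assignment step: squared Mahalanobis distance of each point to each
        # center, weighted by that point's own error precision.
        for i in range(n):
            diffs = centers - X[i]                     # (k, d)
            dists = np.einsum('kd,de,ke->k', diffs, inv_covs[i], diffs)
            labels[i] = np.argmin(dists)

        # Update step: the ML center of a cluster under independent Gaussian
        # errors is the inverse-covariance-weighted mean of its members.
        for c in range(k):
            idx = np.where(labels == c)[0]
            if idx.size == 0:
                continue
            W = inv_covs[idx].sum(axis=0)              # sum of member precisions
            b = np.einsum('nde,ne->d', inv_covs[idx], X[idx])
            centers[c] = np.linalg.solve(W, b)

    return labels, centers
```

A consequence of this weighting is that observations with noisier measurements pull their cluster's center less strongly than precisely measured ones, which is the behavior that distinguishes error-based clustering from ordinary k-means.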