Abstract. Traditional clustering methods assume that there is no measurement error, or uncertainty, associated with data. Often, however, real-world applications require treatment of data that have such errors. In the presence of measurement errors, well-known clustering methods like k-means and hierarchical clustering may not produce satisfactory results. The fundamental question addressed in this paper is: "What is an appropriate clustering method in the presence of errors associated with data?" In the first part of this paper, we develop a statistical model and algorithms for clustering data in the presence of errors. We assume that the errors associated with data follow a multivariate Gaussian distribution and are independent between data points. The model uses the maximum likelihood principle and provides us with a new metric for clustering. This metric is used to develop two algorithms for error-based clustering, hError and kError, which are generalizations of Ward's hierarchical and k-means clustering algorithms, respectively. In the second part of the paper, we discuss classes of clustering problems where error information associated with the data to be clustered is readily available and where error-based clustering is likely to be superior to clustering methods that ignore error. We give examples of the effectiveness of error-based clustering on data generated from the following statistical models: (1) sample averaging, (2) multiple linear regression, (3) ARIMA time series, and (4) Markov chain models. We present theoretical and empirical justifications for the value of error-based clustering on these classes of problems.
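To make the error-based idea concrete, the sketch below gives one plausible reading of a kError-style iteration in Python: each observation carries its own Gaussian error covariance, assignment uses a Mahalanobis-type distance weighted by that observation's error precision, and each cluster center is re-estimated as the inverse-covariance-weighted mean of its members (the maximum-likelihood estimate of a shared mean under independent Gaussian errors). The function name kerror_sketch and all implementation details are illustrative assumptions, not the paper's algorithm or code.

```python
import numpy as np

def kerror_sketch(X, covs, k, n_iter=50, seed=0):
    """Illustrative k-means-style clustering with per-point error covariances.

    X    : (n, d) array of observations
    covs : (n, d, d) array of error covariance matrices, one per observation
    k    : number of clusters
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    inv_covs = np.linalg.inv(covs)                     # per-point precision matrices
    centers = X[rng.choice(n, size=k, replace=False)].astype(float)
    labels = np.zeros(n, dtype=int)

    for _ in range(n_iter):
        # Assignment step: squared Mahalanobis distance of each point to each
        # center, weighted by that point's own error precision.
        for i in range(n):
            diffs = centers - X[i]                     # (k, d)
            dists = np.einsum('kd,de,ke->k', diffs, inv_covs[i], diffs)
            labels[i] = np.argmin(dists)

        # Update step: the ML center of a cluster under independent Gaussian
        # errors is the inverse-covariance-weighted mean of its members.
        for c in range(k):
            idx = np.where(labels == c)[0]
            if idx.size == 0:
                continue
            W = inv_covs[idx].sum(axis=0)              # sum of member precisions
            b = np.einsum('nde,ne->d', inv_covs[idx], X[idx])
            centers[c] = np.linalg.solve(W, b)

    return labels, centers
```

A consequence of this weighting is that observations with noisier measurements pull their cluster's center less strongly than precisely measured ones, which is the behavior that distinguishes error-based clustering from ordinary k-means.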