2011
DOI: 10.1002/sam.10105

Clustering large data sets described with discrete distributions and its application on TIMSS data set

Abstract: Symbolic data analysis is based on special descriptions of data: symbolic objects. Such descriptions preserve more detailed information about the data than the standard representations with mean values. Representation with distributions is also a special kind of symbolic object. In the clustering process this representation enables us to consider variables of all types at the same time. We present two clustering methods based on the data descriptions with discrete distributions: the adapted leaders method an…

Cited by 12 publications (7 citation statements)
References 14 publications
“…Recent developments of multivariate techniques dealing with symbolic data include Billard and Diday (). A clustering procedure to deal precisely with our type of symbolic data has been developed (Korenjak‐Černe, Kejžar, & Batagelj, ; Korenjak‐Černe, Batagelj, & Japelj Pavešić, ) and has been termed a modal multivalued symbolic data clustering procedure. Several clustering procedures for symbolic data have been adapted and implemented in R (R Development Core Team, ) within the package Clamix (Batagelj & Kejžar, ).…”
Section: Clustering Of Symbolic Data Procedures
Citation type: mentioning
confidence: 99%
“…The measure of closeness is the dissimilarity between each pair of clusters, which must be determined between the new (merged) cluster and the remaining clusters after each merge. We used the weighted clustering method based on the method presented in Batagelj et al (2011) for the case of age-sex distributions.…”
Section: The Clustering Methods
Citation type: mentioning
confidence: 99%
“…In this paper, we present the results of a weighted clustering method based on richer descriptions than classical data descriptions (Batagelj et al 2011). The main advantage of this method is that the clusters are represented by real age-sex (frequency) distributions, since the population size of each sex is included in the clustering process.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…In Ref , the authors use the Mahalanobis–Wasserstein distance to define a new k ‐means‐type method for histogram‐valued data. Hierarchical clustering based on classical aggregation indices has been addressed in Ref and, in Ref , the authors propose an extension of the Ward method. Irpino and Verde successfully used the Mahalanobis–Wasserstein distance for hierarchical clustering.…”
Section: Methods For the Analysis Of Symbolic Data
Citation type: mentioning
confidence: 99%