2018
DOI: 10.1002/wics.1456

Distance‐based clustering of mixed data

Abstract: Cluster analysis comprises several unsupervised techniques aiming to identify a subgroup (cluster) structure underlying the observations of a data set. The desired cluster allocation is such that it assigns similar observations to the same subgroup. Depending on the field of application and on domain‐specific requirements, different approaches exist that tackle the clustering problem. In distance‐based clustering, a distance metric is used to determine the similarity between data objects. The distance metri…
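
The abstract is truncated before it names any particular metric, so the following is only an illustrative sketch of what a distance over mixed (numeric and categorical) variables can look like. It uses a Gower-style dissimilarity, which is mentioned in one of the citing statements on this page; it is not taken from the article itself. The function names, the toy records, and the equal weighting of variables are all assumptions made for the example.

```python
from itertools import combinations


def numeric_ranges(records, numeric_features):
    """Return {feature: (min, max)} over all records (hypothetical helper)."""
    return {
        f: (min(r[f] for r in records), max(r[f] for r in records))
        for f in numeric_features
    }


def gower_distance(a, b, ranges):
    """Gower-style dissimilarity between two records with equal weights.

    Numeric features contribute |a - b| / range; every other feature is
    treated as categorical and contributes 0 on a match, 1 otherwise.
    """
    total = 0.0
    for name in a:
        if name in ranges:
            lo, hi = ranges[name]
            span = hi - lo
            # A constant numeric feature carries no information.
            total += abs(a[name] - b[name]) / span if span > 0 else 0.0
        else:
            total += 0.0 if a[name] == b[name] else 1.0
    return total / len(a)


# Toy data, invented purely for illustration.
records = [
    {"age": 20, "colour": "red"},
    {"age": 40, "colour": "blue"},
    {"age": 60, "colour": "red"},
]
ranges = numeric_ranges(records, ["age"])
pairwise = {
    (i, j): gower_distance(records[i], records[j], ranges)
    for i, j in combinations(range(len(records)), 2)
}
```

The resulting pairwise dissimilarities could then be passed to any clustering procedure that accepts a precomputed distance matrix, for example hierarchical clustering or partitioning around medoids.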

Citations: cited by 64 publications (40 citation statements)
References: 54 publications
“…Our dataset consisted of both categorical and continuous variables. Despite the problem of combining both categorical and continuous data, usually referred to as "mixed data", in learning algorithms is well known in literature (16) and usually requires sophisticated and dedicated strategies (17). However, it is known that some algorithms, like the ones used in this work, can manage these data (for several purposes) using simple numeric encoding (18)(19)(20)(21).…”
Section: Methods (mentioning)
confidence: 99%
“…As (k + T + 1) << n, we can see this is much less computationally intensive than, for instance, calculating Gower’s similarity coefficient and applying multidimensional scaling. See references [39, 40] for comprehensive reviews of clustering methods for mixed data. where for and for .…”
Section: Methods (mentioning)
confidence: 99%
“…However, they are not detailed and they concentrate on specific types of clustering algorithms. Velden et al [11] study five distance-based clustering algorithms for mixed data on three mixed datasets. They conclude that there is no single clustering approach that performs well for all the datasets.…”
Section: Survey Of Other Review Papers (mentioning)
confidence: 99%