2015
DOI: 10.19026/rjaset.10.1846

A Precise Distance Metric for Mixed Data Clustering using Chi-square Statistics

Abstract: Data today is often available as a mix of numerical and categorical values. Traditional data clustering algorithms perform well for numerical data but produce poor clustering results for mixed data. For better partitioning, the distance metric should be capable of discriminating between data points with mixed attributes, and it should appropriately balance the categorical and numerical distances. In this study we have proposed a chi-square based statistical approach to de…

Cited by 4 publications (1 citation statement)
References 16 publications
“…Similar clustering results are achieved with both distance measures. Mohanavalli and Jaisakthi [210] use chi-square statistics for computing the weight of each feature of mixed data. The Euclidean distance for numeric features and the Hamming distance for categorical features along with these weights are used to compute the distances.…”
Section: E Other (mentioning)
confidence: 99%
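
The citation statement above describes a distance that combines a Euclidean term for numeric features and a Hamming term for categorical features, each scaled by per-feature weights. A minimal sketch of that idea follows. It is not the authors' implementation: the function name `mixed_distance`, the choice to add the two terms together, and the weight values are all assumptions made for illustration. In the paper the weights come from chi-square statistics; how they are derived is not shown in this excerpt, so the sketch simply takes them as inputs.

```python
import math
from typing import Hashable, Sequence


def mixed_distance(
    x_num: Sequence[float],
    y_num: Sequence[float],
    x_cat: Sequence[Hashable],
    y_cat: Sequence[Hashable],
    w_num: Sequence[float],
    w_cat: Sequence[float],
) -> float:
    """Weighted Euclidean distance on numeric features plus weighted
    Hamming distance on categorical features (hypothetical combination)."""
    if not (len(x_num) == len(y_num) == len(w_num)):
        raise ValueError("numeric vectors and weights must have equal length")
    if not (len(x_cat) == len(y_cat) == len(w_cat)):
        raise ValueError("categorical vectors and weights must have equal length")

    # Weighted Euclidean part: each squared difference is scaled by its weight.
    numeric = math.sqrt(
        sum(w * (a - b) ** 2 for a, b, w in zip(x_num, y_num, w_num))
    )

    # Weighted Hamming part: a mismatch contributes the feature's weight,
    # a match contributes nothing.
    categorical = sum(w for a, b, w in zip(x_cat, y_cat, w_cat) if a != b)

    # Assumption: the two parts are simply added.
    return numeric + categorical


# Illustrative weights only; in the paper they would come from chi-square statistics.
d = mixed_distance(
    x_num=[1.0, 2.0], y_num=[4.0, 6.0],
    x_cat=["red", "small"], y_cat=["red", "large"],
    w_num=[1.0, 1.0], w_cat=[0.5, 2.0],
)
print(d)
```

With unit numeric weights the numeric part reduces to the ordinary Euclidean distance, and a categorical feature with a larger weight contributes more to the total when its values differ.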