2004
DOI: 10.1016/j.patcog.2003.12.015

Dissimilarity learning for nominal data

Abstract: Defining a good distance (dissimilarity) measure between patterns is of crucial importance in many classification and clustering algorithms. While a lot of work has been performed on continuous attributes, nominal attributes are more difficult to handle. A popular approach is to use the value difference metric (VDM) to define a real-valued distance measure on nominal values. However, VDM treats the attributes separately and ignores any possible interactions among attributes. In this paper, we propose the use of adap…
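
The per-attribute behaviour that the abstract criticises is easy to see in code. Below is a minimal sketch of a simplified, single-attribute VDM, assuming class-labelled training data; the function name `vdm_distance` and the exponent parameter `q` are illustrative assumptions, not the paper's notation.

```python
from collections import Counter, defaultdict

def vdm_distance(column, labels, x, y, q=2):
    """Simplified Value Difference Metric between two values x, y of one
    nominal attribute: compares their class-conditional distributions.
    Assumes x and y both occur in `column`."""
    value_class = defaultdict(Counter)  # value -> class-label counts
    value_total = Counter()             # value -> total occurrences
    for v, c in zip(column, labels):
        value_class[v][c] += 1
        value_total[v] += 1
    return sum(
        abs(value_class[x][c] / value_total[x]
            - value_class[y][c] / value_total[y]) ** q
        for c in set(labels)
    )

# Two values are close when they induce similar class distributions.
colour = ["red", "red", "blue", "blue", "blue", "green"]
label  = [1, 1, 0, 0, 1, 0]
print(vdm_distance(colour, label, "red", "blue"))    # ~0.89: dissimilar class profiles
print(vdm_distance(colour, label, "blue", "green"))  # ~0.22: more similar profiles
```

Because the distance is computed one attribute at a time (and would simply be summed across attributes), interactions between attributes never enter the computation, which is exactly the limitation the paper's adaptive dissimilarity aims to address.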


Cited by 52 publications (27 citation statements)
References 10 publications
“…In addition, another set of methods, e.g., [14], [16], [29], [30], [31], [32], [33], learns metrics using side-information such as labels to guide metric learning for categorical data. The work in [29] is the first to consider label information for categorical data similarity; it uses labels to divide the data into subsets and considers the attribute value distribution within these subsets.…”
Section: Related Work (mentioning)
confidence: 99%
“…Although these methods can learn distances on numerical data, they cannot handle categorical data directly. While categorical input is involved in works such as [14], [15], [16], those works ignore the above-discussed couplings and heterogeneities.…”
Section: Introduction (mentioning)
confidence: 99%
“…This often results in clusters with weak intra-similarity (Ng et al., 2007), which may entail a loss of semantic content in the partition generated by a clustering algorithm. As concerns the k-modes (Huang, 1998) algorithm, several frequency-based dissimilarity measures between categorical objects have been proposed in the literature (Cheng et al., 2004; Quang and Bao, 2004). The proposed dissimilarity measure for categorical objects is a frequency-based measure and follows the work of He et al. (2011), in which a feature-weighted k-modes algorithm is studied, where the weights are related to the frequency of a category in a given cluster.…”
Section: Semantic Distance For Categorical Data (mentioning)
confidence: 99%
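
To make the quoted idea concrete, here is a minimal sketch of a frequency-based dissimilarity between a categorical object and a cluster mode, in the spirit of Ng et al. (2007); the function name and the exact weighting are illustrative assumptions, not the formulation of any one cited paper.

```python
def freq_based_dissimilarity(obj, mode, cluster):
    """Dissimilarity between categorical record `obj` and the mode of
    `cluster` (a list of categorical records of equal length)."""
    n = len(cluster)
    d = 0.0
    for j, (x_j, z_j) in enumerate(zip(obj, mode)):
        if x_j != z_j:
            d += 1.0  # a plain mismatch costs 1, as in standard k-modes
        else:
            # A match costs 1 - (relative frequency of the category in the
            # cluster): matching a category that dominates the cluster is
            # rewarded more than matching a rare one.
            freq = sum(1 for rec in cluster if rec[j] == x_j) / n
            d += 1.0 - freq
    return d

cluster = [("red", "small"), ("red", "large"), ("blue", "small")]
mode = ("red", "small")
print(freq_based_dissimilarity(("red", "small"), mode, cluster))  # ~0.67
```

Compared with the 0/1 matching of plain k-modes, this weighting strengthens intra-cluster similarity, which is the concern raised in the quoted passage.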
“…As mentioned earlier, similarity/dissimilarity learning is more flexible than distance metric learning, as it does not need to follow the axioms of a metric exactly. A dissimilarity learning method for nominal data is proposed in [23]. The learned measure is transitive, but it does not satisfy the triangle inequality.…”
Section: Related Work (mentioning)
confidence: 99%
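
A quick made-up illustration of how a dissimilarity can fail the triangle inequality (the numbers are invented for the example, not taken from [23]): with d(a, b) = 1, d(b, c) = 1 and d(a, c) = 3, we get d(a, c) > d(a, b) + d(b, c), so being close to a common value b places no bound on how far a and c may be from each other.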
“…The definition of similarity can vary among authors, depending on which properties are desired. In some works, similarity/dissimilarity measures are required to have certain properties such as reflexivity and transitivity, but are not constrained by the triangle inequality [23]. In [75], an asymmetric dissimilarity measure is learned.…”
Section: Introduction (mentioning)
confidence: 99%