Large margin classification with indefinite similarities

Alabdulmohsin, Ibrahim; Cissé, Moustapha; Gao, Xin; Zhang, Xiangliang

doi:10.1007/s10994-015-5542-8

Cited by 10 publications

(4 citation statements)

References 26 publications


“…However, while our method beats others most of the time and consistently, it also slightly fails such in few cases proving the no free lunch theorem that is, no method performs better 100% times. Moreover, [20] discussed about SVM with indefinite kernels where it is showed that similarity function used for L1-norm or LP SVM does not need to be positive semi/definite and we use LP SVM which remains convex even if the similarity matrix is indefinite. Although in some cases compared to the sparse machines, our similarity based SVM in- As outlier patterns stay outside the decision boundary of their own classes, for them, generally, similarities to the patterns of their opposite classes are higher compared to the similarities to the patterns of their own classes; hence, ratio of the sum of similarities to opposite class with respect to own class should be higher for a good similarity function.…”

Section: Discussion (mentioning)

confidence: 99%
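
The outlier heuristic described in the statement above can be illustrated numerically. The following is a minimal sketch, not the citing authors' code: the function name `opposite_to_own_ratio`, the toy 1-D data, and the similarity 1/(1 + |x_i - x_j|) are all assumptions chosen only for illustration.

```python
import numpy as np

def opposite_to_own_ratio(S, y):
    """For each pattern, return the sum of its similarities to the opposite
    class divided by the sum of its similarities to its own class
    (self-similarity excluded)."""
    S = np.asarray(S, dtype=float)
    y = np.asarray(y)
    same = y[:, None] == y[None, :]
    np.fill_diagonal(same, False)      # do not count a pattern's similarity to itself
    diff = y[:, None] != y[None, :]
    own = (S * same).sum(axis=1)
    opp = (S * diff).sum(axis=1)
    return opp / own

# Hypothetical toy data: the pattern at x = 3.1 is labelled +1
# but sits inside the -1 cluster, so it plays the role of an outlier.
x = np.array([0.0, 0.2, 0.4, 3.1, 3.0, 3.2, 3.4])
y = np.array([1, 1, 1, 1, -1, -1, -1])

# Assumed similarity function; it is not required to be a valid kernel.
S = 1.0 / (1.0 + np.abs(x[:, None] - x[None, :]))

ratios = opposite_to_own_ratio(S, y)
outlier_index = int(np.argmax(ratios))
print(outlier_index, ratios.round(2))
```

In this toy setting the mislabelled-looking pattern receives the largest ratio, which is the behaviour the statement expects from a similarity function that separates the classes well.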

“…Learning with indefinite kernel or non-PSD similarity matrix has attracted huge concentration [11][12][13][14][15][16][17][18][19]. However, [20] have divided recent work on training SVM with indefinite kernels into three main kinds: PSD kernel approximation, non-convex optimization, and learning in Krein spaces with a conclusion that all methods are not fully adequate as they have either hosted bases of inconsistency in handling training and test patterns using kernel approximation which harms generalization guarantees or established for approximate local minimum solutions by non-convex optimization, or generated nonsparse solutions. But there is another approach that has been studied in a sequence of papers [1], [21], [13], [22] that adopt a certain "goodness" property, which is formally defined for the similarity function and provide both generalization guarantees in terms of how well-suited the similarity function is to the classification task at hand as well as the capability to use fast algorithmic techniques.…”

Section: Introduction (mentioning)

confidence: 99%


LP SVM with A Novel Similarity function Outperforms Powerful LP-QP-Kernel-SVM Considering Efficient Classification

Karim, Hasan, Kundu et al. 2023

IJCAI


While the core quality of SVM comes from its ability to get the global optima, its classification performance also depends on computing kernels. However, while this kernel-complexity generates the power of machine, it is also responsible for the computational load to execute this kernel. Moreover, insisting on a similarity function to be a positive definite kernel demands some properties to be satisfied that seem unproductive sometimes raising a question about which similarity measures to be used for classifier. We model Vapnik's LPSVM proposing a new similarity function replacing kernel function. Following the strategy of "Accuracy first, speed second", we have modelled a similarity function that is mathematically well-defined depending on analysis as well as geometry and complex enough to train the machine for generating solid generalization ability. Being consistent with the theory of learning by Balcan and Blum [1], our similarity function does not need to be a valid kernel function and demands less computational cost for executing compared to its counterpart like RBF or other kernels while provides sufficient power to the classifier using its optimal complexity. Benchmarking shows that our similarity function based LPSVM poses test error 0.86 times of the most powerful RBF based QP SVM but demands only 0.40 times of its computational cost.

Summary (translated from Slovenian): A new similarity function is proposed for SVM that replaces the kernel function, requires less computation, and achieves high accuracy and speed.


“…There are two main directions to handle the problem of indefiniteness: using insensitive methods like indefinite kernel fisher discrimination (Haasdonk and Pekalska, 2008), empirical feature space approaches (Alabdulmohsin et al, 2016), or correcting the eigenspectrum to psd.…”

Section: Introduction (mentioning)

confidence: 99%
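
One common form of the eigenspectrum correction mentioned in the statement above is "clipping", where negative eigenvalues of the similarity matrix are set to zero. The following is a minimal sketch under the assumption of a symmetric similarity matrix; the function name `clip_spectrum` and the toy matrix are hypothetical and are not taken from the cited papers.

```python
import numpy as np

def clip_spectrum(S):
    """Return a positive semidefinite approximation of a symmetric
    similarity matrix by zeroing its negative eigenvalues."""
    S = np.asarray(S, dtype=float)
    S = 0.5 * (S + S.T)                      # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(S)     # eigendecomposition for symmetric matrices
    clipped = np.clip(eigvals, 0.0, None)    # drop the negative part of the spectrum
    return (eigvecs * clipped) @ eigvecs.T   # V diag(clipped) V^T

# Hypothetical indefinite similarity matrix (it has one negative eigenvalue).
S = np.array([[ 1.0, 0.9, -0.8],
              [ 0.9, 1.0,  0.9],
              [-0.8, 0.9,  1.0]])

S_psd = clip_spectrum(S)
print(np.linalg.eigvalsh(S).min(), np.linalg.eigvalsh(S_psd).min())
```

A correction of this kind changes the training similarities, so test-time similarities need a matching treatment; other variants (for example flipping or shifting the spectrum) follow the same pattern with a different rule for the eigenvalues.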

Structure Preserving Encoding of Non-euclidean Similarity Data

Münch, Raab, Biehl et al. 2020

Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods


“…However, unlike supervised learning, similarity measure in the unsupervised scheme for categorical data has received much less attention until now [8], [9]. Without both the label information and the numerical attributes, it's much challengeable to distinguish different categorical values [10]. Currently, only limited efforts have been made, mainly including matching-based [11], frequency-based [12], and information theory [13] based methods.…”

Section: Introduction (mentioning)

confidence: 99%

Heterogeneous Graph Based Similarity Measure for Categorical Data Unsupervised Learning

Jiang et al. 2019

IEEE Access


Different from numerical attributes, measuring the similarity between categorical attributes is more complex due to their non-inherently ordered characteristic, especially in an unsupervised scheme. This work, therefore, presents a new method, Heterogeneous Graph-based Similarity measure (HGS), to measure the similarity between categorical data for unsupervised learning. In order to capture the possible complex relationships hidden among attributes, a heterogeneous weighted graph is creatively constructed by extracting the information from categorical data. Both objects and attribute values are represented as nodes and their occurrence and co-occurrence relationships are shown as edges. Based on a derived node-pair graph, three rules are used to iteratively update the similarity scores between object pairs and attribute-value pairs until the scores converge. We also analyze its complexities and validate the metric properties and convergence. In experiment validation, five state-of-the-art measures are compared with HGS based on 20 UCI datasets and 6 high-dimensional datasets in the medical domain in both k-modes and spectral clustering and similarity search experiments. The results show although no measure can outperform all other measures on all datasets, HGS can perform better in both clustering and similarity search tasks on the whole. Finally, six studies further discuss the convergence, time cost, and parameter sensitivity of the HGS, explore its application to imbalanced class distribution, and compare it with its variants by different initialization and graph construction.

Index terms: unsupervised learning, similarity measure, categorical data, heterogeneous graph-based similarity (HGS).
