Scalable Semi-Supervised SVM via Triply Stochastic Gradients

Geng, Xin; Gu, Bin; Li, Xiang; Shi, Wanli; Zheng, Guansheng; Huang, Heng

doi:10.24963/ijcai.2019/328

Cited by 12 publications

(5 citation statements)

References 1 publication


“…Since our proposed method contains three sources of randomness, we denote our method as Triply Stochastic Gradient descent for SU Classification (TSGSU). Theoretically, we give a new theoretically analysis based on the framework in Geng et al (2019), Dai et al (2014) and prove that our method can converge to the stationary point at the rate of O(1/√T) after T iterations. Our experiments on various benchmark datasets and high-dimensional datasets not only demonstrate the scalability but also show the efficiency of TSGSU compared with existing learning algorithms while retaining similar generalization performance.…”

Section: Introduction (mentioning)

confidence: 83%

“…It uses pseudo-random number generators to generate the random features on the fly, which highly reduces the memory requirement of RFF. Due to its superior performance, DSG has been successfully applied to scale up kernel-based algorithms in many applications, such as (Gu et al 2018b; Li et al 2017; Rahimi and Recht 2009; Le et al 2013; Shi et al 2019; Geng et al 2019). The theoretical analysis of Dai et al (2014), Gu et al (2018b), Li et al (2017), Shi et al (2019) are all based on the assumption that the objective functions of these problems are convex.…”

Section: Kernel Approximation (mentioning)

confidence: 99%
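
The on-the-fly feature generation described in the statement above can be illustrated with a minimal sketch. It is not the code of any cited paper: the Gaussian kernel, the hinge loss, the 1/t step size, and the names `random_feature`, `predict`, and `dsg_train` are all assumptions made for illustration. The point it shows is that each random feature is rebuilt from an integer seed whenever it is needed, so only one coefficient per iteration is stored.

```python
import math
import random


def random_feature(x, seed, sigma=1.0):
    """Rebuild one random Fourier feature of x from its integer seed.

    Assumes a Gaussian kernel with bandwidth sigma (an illustrative choice).
    The frequency vector and phase are never stored, only regenerated.
    """
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0 / sigma) for _ in x]
    b = rng.uniform(0.0, 2.0 * math.pi)
    return math.sqrt(2.0) * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)


def predict(x, alphas, sigma=1.0):
    """f(x) = sum_i alpha_i * phi_i(x), where phi_i is regenerated from seed i."""
    return sum(a * random_feature(x, i, sigma) for i, a in enumerate(alphas))


def dsg_train(data, steps, step_size=1.0, reg=1e-3, sigma=1.0, sample_seed=0):
    """Doubly stochastic functional gradient loop (sketch).

    Randomness source 1: a training point sampled each step.
    Randomness source 2: a fresh random feature, identified by the step index.
    The hinge loss is an assumption; labels are expected in {-1, +1}.
    """
    sampler = random.Random(sample_seed)
    alphas = []
    for t in range(steps):
        x, y = sampler.choice(data)
        fx = predict(x, alphas, sigma)
        gamma = step_size / (t + 1)
        grad = -y if y * fx < 1.0 else 0.0  # hinge-loss subgradient w.r.t. f(x)
        # Shrink old coefficients (regularization), then add the new one.
        alphas = [(1.0 - gamma * reg) * a for a in alphas]
        alphas.append(-gamma * grad * random_feature(x, t, sigma))
    return alphas
```

Memory grows with the number of iterations (one float each) rather than with the number of stored feature vectors, at the cost of recomputing features at prediction time.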

“…However, in many other practical applications, such as disaster resilience, medical diagnosis, and bioinformatics, massive labeled data cannot be collected easily since manually labeling the unlabeled data is time-consuming and laborious. To handle this problem, great efforts have been done in weakly-supervised classification, including semi-supervised learning (Chapelle et al 2009; Sakai et al 2017, 2018; Geng et al 2019; Shi et al 2019; Yu et al 2019), positive unlabeled (PU) learning (du Plessis et al 2014, 2015b) and one-class classification (Khan and Madden 2009; Schölkopf et al 2001).…”

Section: Introduction (mentioning)

confidence: 99%


Triply stochastic gradient method for large-scale nonlinear similar unlabeled classification

Shi et al. 2021, Mach Learn (self cite)

Similar unlabeled (SU) classification is pervasive in many real-world applications, where only similar data pairs (two data points have the same label) and unlabeled data points are available to train a classifier. Recent work has identified a practical SU formulation and has derived the corresponding estimation error bound. It evaluated SU learning with linear classifiers on medium-sized datasets. However, in practice, we often need to learn nonlinear classifiers on large-scale datasets for superior predictive performance. How this could be done in an efficient manner is still an open problem for SU classification. In this paper, we propose a scalable kernel learning algorithm for SU classification using a triply stochastic optimization framework, called TSGSU. Specifically, in each iteration, our method randomly samples an instance from the similar pairs set, an instance from the unlabeled set, and their random features to calculate the stochastic functional gradient for the model update. Theoretically, we prove that our method can converge to a stationary point at the rate of O(1/√T) after T iterations. Experiments on various benchmark datasets and high-dimensional datasets not only demonstrate the scalability of TSGSU but also show the efficiency of TSGSU compared with existing SU learning algorithms while retaining similar generalization performance.


“…However, solving the S3VM problem is very challenging for its nonconvexity and computational cost [23]. It is still an open problem to scale up S3VM for large-scale applications.…”

Section: Label Recommendation (mentioning)

confidence: 99%

Towards Visual Explainable Active Learning for Zero-Shot Classification

Jia, Zhang 2021, Preprint

Fig. 1. Semantic navigator is a mixed-initiative visual analytics system for zero-shot classification. (a) The machine asks contrastive questions to guide analysts to come up with new attributes. (b) The semantic map explains the machine's status and presents the label recommendations (striped contours). Analysts select partial classes as positive (solid red contours) or negative (solid blue contours) to adjust the label recommendations ((c) and (d)). (e) The line chart monitors the training accuracy for seen classes and testing accuracy for unseen classes. (f) The class-attribute matrix is built interactively via collaboration between analysts and the machine.


“…However, RFF method needs to save large amounts of random features. Instead of saving all the random features, Dai et al, (2014) proposed DSG algorithm to use pseudorandom number generators to generate the random features on-the-fly, which has been widely used (Shi et al 2019;Geng et al 2019;Li et al 2017). Our method can be viewed as an extension of (Shi et al 2019).…”

Section: Kernel Approximation (mentioning)

confidence: 99%

Quadruply Stochastic Gradient Method for Large Scale Nonlinear Semi-Supervised Ordinal Regression AUC Optimization

Shi et al. 2020, AAAI (self cite)

Semi-supervised ordinal regression (S2OR) problems are ubiquitous in real-world applications, where only a few ordered instances are labeled and massive instances remain unlabeled. Recent researches have shown that directly optimizing concordance index or AUC can impose a better ranking on the data than optimizing the traditional error rate in ordinal regression (OR) problems. In this paper, we propose an unbiased objective function for S2OR AUC optimization based on ordinal binary decomposition approach. Besides, to handle the large-scale kernelized learning problems, we propose a scalable algorithm called QS3ORAO using the doubly stochastic gradients (DSG) framework for functional optimization. Theoretically, we prove that our method can converge to the optimal solution at the rate of O(1/t), where t is the number of iterations for stochastic data sampling. Extensive experimental results on various benchmark and real-world datasets also demonstrate that our method is efficient and effective while retaining similar generalization performance.
