2015
DOI: 10.1007/s10044-015-0485-z

Proximal gradient method for huberized support vector machine

Abstract: The Support Vector Machine (SVM) has been used in a wide variety of classification problems. The original SVM uses the hinge loss function, which is non-differentiable and makes the problem difficult to solve, in particular for regularized SVMs such as those with $\ell_1$-regularization. This paper considers the Huberized SVM (HSVM), which uses a differentiable approximation of the hinge loss function. We first explore the use of the Proximal Gradient (PG) method for solving the binary-class HSVM (B-HSVM) and then genera…
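For orientation, here is a minimal sketch (not the paper's exact algorithm) of the two ingredients the abstract refers to: a huberized (smoothed) hinge loss with a Lipschitz-continuous derivative, and one proximal gradient step for an ℓ1-regularized B-HSVM-type objective. The smoothing parameter delta, the step size, and the plain ℓ1 penalty are illustrative choices.

```python
import numpy as np

def huberized_hinge(t, delta=0.5):
    """One standard smoothed (huberized) hinge loss: piecewise quadratic,
    differentiable everywhere, with a (1/delta)-Lipschitz derivative."""
    t = np.asarray(t, dtype=float)
    return np.where(t > 1.0, 0.0,
           np.where(t > 1.0 - delta, (1.0 - t) ** 2 / (2.0 * delta),
                    1.0 - t - delta / 2.0))

def huberized_hinge_grad(t, delta=0.5):
    """Derivative of the loss above with respect to the margin argument t."""
    t = np.asarray(t, dtype=float)
    return np.where(t > 1.0, 0.0,
           np.where(t > 1.0 - delta, -(1.0 - t) / delta, -1.0))

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def pg_step(w, X, y, lam, step, delta=0.5):
    """One proximal gradient step for an l1-regularized huberized SVM:
    minimize (1/n) * sum_i phi_delta(y_i * x_i^T w) + lam * ||w||_1."""
    margins = y * (X @ w)
    grad = X.T @ (y * huberized_hinge_grad(margins, delta)) / X.shape[0]
    return soft_threshold(w - step * grad, step * lam)
```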

Cited by 27 publications (11 citation statements)
References 50 publications
“…We will show that a range of nontrivial ω_k > 0 always exists to satisfy Condition 2.1 under a mild assumption, and thus one can backtrack ω_k to ensure F(x^k) ≤ F(x^{k-1}), ∀k. Maintaining the monotonicity of F(x^k) can significantly improve the numerical performance of the algorithm, as shown in our numerical results below and also in [44,55]. Note that subsequence convergence does not require this condition.…”
(mentioning)
confidence: 62%
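As a reading aid, the backtracking-on-ω_k idea described in this statement can be sketched as follows; the helper names (F, prox_grad_step) and the shrink factor are hypothetical, and the fallback assumes a sufficiently small proximal-gradient step size so that a plain, non-extrapolated step does not increase F.

```python
def monotone_extrapolated_step(F, prox_grad_step, x_prev, x_curr,
                               omega=1.0, shrink=0.5, max_tries=20):
    """Try an extrapolated proximal-gradient step and backtrack the
    extrapolation weight omega until the objective does not increase,
    i.e. F(x_next) <= F(x_curr). prox_grad_step(y) is assumed to return
    one proximal gradient update taken from the extrapolated point y."""
    for _ in range(max_tries):
        y = [c + omega * (c - p) for c, p in zip(x_curr, x_prev)]  # extrapolate
        x_next = prox_grad_step(y)
        if F(x_next) <= F(x_curr):        # monotonicity check
            return x_next, omega
        omega *= shrink                   # backtrack the extrapolation weight
    # fall back to a non-extrapolated step, which (for a sufficiently small
    # step size) does not increase F
    return prox_grad_step(x_curr), 0.0
```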
“…Hence, the condition in (2.4) can be slightly stronger than that in (2.6). The condition in (2.4) holds if dom(h) is bounded, and it holds naturally if the {f_j} are linear, logistic-loss, or huberized hinge loss functions Xu et al (2016).…”
Section: Assumptions (mentioning)
confidence: 99%
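The exact form of condition (2.4) is not quoted above, but the example losses named in this statement share a globally bounded, Lipschitz-continuous derivative, which is the kind of property such conditions typically exploit. For reference (illustrative only, with the standard definitions):

```latex
\[
  f_{\mathrm{logistic}}(t) = \log\!\bigl(1 + e^{-t}\bigr), \qquad
  f'_{\mathrm{logistic}}(t) = \frac{-1}{1 + e^{t}} \in (-1, 0),
\]
\[
  f_{\mathrm{huber}}(t) =
  \begin{cases}
    0, & t > 1,\\
    \dfrac{(1-t)^2}{2\delta}, & 1-\delta < t \le 1,\\
    1 - t - \dfrac{\delta}{2}, & t \le 1-\delta,
  \end{cases}
  \qquad f'_{\mathrm{huber}}(t) \in [-1, 0].
\]
```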
“…After this, a projection step is performed to project x^{t+1/2} onto X. In particular, two scenarios may happen: (i) x^{t+1/2} ∈ X; in this case, the projection step directly returns x^{t+1/2}; (ii) x^{t+1/2} ∉ X; we need to project x^{t+1/2} to the closest point in X, which is very time-consuming.…”
Section: Fully Projection-free Proximal Stochastic Gradient (mentioning)
confidence: 99%
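The two cases described in this statement can be made concrete with a small sketch; here the feasible set X is taken to be a Euclidean ball purely for illustration (where the projection happens to be cheap), whereas the quoted text is concerned with sets for which case (ii) is expensive.

```python
import numpy as np

def project_l2_ball(x, radius=1.0):
    """Euclidean projection onto {x : ||x||_2 <= radius}; stands in for the
    (possibly expensive) projection onto a general feasible set X."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else (radius / norm) * x

def projected_half_step(x_t, grad, step, radius=1.0):
    """Illustrative two-phase update: take a gradient step to x^{t+1/2},
    then project back onto X only if the half-step leaves X."""
    x_half = x_t - step * grad                # x^{t+1/2}
    if np.linalg.norm(x_half) <= radius:      # case (i): already feasible
        return x_half
    return project_l2_ball(x_half, radius)    # case (ii): project onto X
```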
“…Proximal stochastic gradient method is widely used to solve large-scale machine learning problems such as support vector machines [1,2] and logistic regression [3]. Generally, it iteratively finds a descent direction, and then updates the model within a feasible set by following the direction until convergence.…”
Section: Introduction (mentioning)
confidence: 99%
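A minimal example of the pattern this statement describes (a stochastic descent direction followed by a proximal update), here for ℓ1-regularized logistic regression rather than the cited SVM formulations; all names and parameter values are illustrative.

```python
import numpy as np

def soft_threshold(z, tau):
    """Prox of tau*||.||_1, applied after the stochastic descent step."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def proximal_sgd_logreg(X, y, lam=1e-3, step=0.1, epochs=5, batch=32, seed=0):
    """Minimal proximal stochastic gradient loop for l1-regularized logistic
    regression with labels y in {-1, +1}; a sketch, not the cited algorithm."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for _ in range(max(1, n // batch)):
            idx = rng.choice(n, size=min(batch, n), replace=False)
            margins = y[idx] * (X[idx] @ w)
            # stochastic gradient of the smooth logistic part
            g = -(X[idx].T @ (y[idx] / (1.0 + np.exp(margins)))) / len(idx)
            # descent step followed by the proximal (soft-threshold) update
            w = soft_threshold(w - step * g, step * lam)
    return w
```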