Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d15-1032

An Empirical Analysis of Optimization for Max-Margin NLP

Abstract: Despite the convexity of structured max-margin objectives (Tsochantaridis et al., 2004), the many ways to optimize them are not equally effective in practice. We compare a range of online optimization methods over a variety of structured NLP tasks (coreference, summarization, parsing, etc.) and find several broad trends. First, margin methods do tend to outperform both likelihood and the perceptron. Second, for max-margin objectives, primal optimization methods are often more robust and progress faster than dual …
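The primal (subgradient) optimization the abstract refers to can be sketched for a structured hinge loss. This is a minimal illustrative sketch, not the paper's implementation: the explicit candidate list, feature vectors, and step size below are assumptions for exposition (real structured tasks decode over exponentially many structures).

```python
def dot(w, f):
    return sum(wi * fi for wi, fi in zip(w, f))

def subgradient_step(w, feats, gold, loss, lr=0.1):
    """One primal subgradient step on the structured hinge loss.

    feats[y] is the feature vector of candidate structure y; loss[y] is its
    task loss against the gold structure (loss[gold] == 0). Both are
    hypothetical toy inputs standing in for a real decoder's output space.
    """
    # Cost-augmented decoding: find the most-violating candidate.
    aug = [dot(w, feats[y]) + loss[y] for y in range(len(feats))]
    y_hat = max(range(len(feats)), key=lambda y: aug[y])
    if aug[y_hat] > aug[gold]:
        # Hinge is active: step toward gold features, away from the violator.
        w = [wi + lr * (g - v) for wi, g, v in zip(w, feats[gold], feats[y_hat])]
    return w
```

Iterating this update over examples is the online primal scheme: once every candidate's margin constraint (gold score at least rival score plus its loss) is satisfied, the subgradient vanishes and the weights stop moving.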

Cited by 14 publications (8 citation statements). References 22 publications.
“…We used a structured SVM objective (Taskar et al., 2005; Tsochantaridis et al., 2004) optimized with stochastic subgradient descent (Kummerfeld et al., 2015). Our implementation used a structured prediction library (Kummerfeld et al., 2015) available at https://github.com/tberg12/murphy.git; all models evaluated here were built on this framework (available upon request). Training was relatively fast, with the most complex model training in <15 min.…”
Section: Hidden Semi-Markov Model
confidence: 99%
“…Code for the parser, for conversion to and from our representation, and for our metrics is available 3 . Our parser uses a linear discriminative model, with features based on McDonald et al. (2005), trained with an online primal subgradient approach (Ratliff et al., 2007) as described by Kummerfeld, Berg-Kirkpatrick, et al. (2015), with parallel lock-free sparse updates. Loss Function: We use a weighted Hamming distance for loss-augmented decoding, as it can be efficiently decomposed within our dynamic program.…”
Section: Methods
confidence: 99%
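The decomposition this excerpt relies on — a weighted Hamming loss that folds into the dynamic program — can be sketched for a simple sequence tagger. This is an illustrative assumption-laden sketch, not the cited parser: because the loss decomposes per position, adding each position's error cost to the local scores lets an unmodified Viterbi recursion perform loss-augmented decoding.

```python
def hamming_augment(emissions, gold, weights):
    """Fold weighted Hamming loss into local scores.

    emissions[t][y]: model score of tag y at position t (toy values here);
    gold[t]: gold tag; weights[t]: cost of an error at position t.
    """
    return [
        [s + (weights[t] if y != gold[t] else 0.0) for y, s in enumerate(row)]
        for t, row in enumerate(emissions)
    ]

def viterbi(emissions, transitions):
    """Standard max-product dynamic program over tag sequences."""
    T, K = len(emissions), len(emissions[0])
    best = [emissions[0][:]]
    back = []
    for t in range(1, T):
        row, ptr = [], []
        for y in range(K):
            cand = [best[t - 1][yp] + transitions[yp][y] for yp in range(K)]
            yp = max(range(K), key=lambda i: cand[i])
            row.append(cand[yp] + emissions[t][y])
            ptr.append(yp)
        best.append(row)
        back.append(ptr)
    y = max(range(K), key=lambda i: best[-1][i])
    path = [y]
    for ptr in reversed(back):
        y = ptr[y]
        path.append(y)
    return list(reversed(path))
```

Running `viterbi` on `hamming_augment(emissions, gold, weights)` returns the most-violating sequence, exactly the cost-augmented decode a max-margin update needs, at no extra asymptotic cost.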
“…In the NLP literature, margin-based learning has been applied to parsing (Taskar et al., 2004; McDonald et al., 2005), text classification (Taskar et al., 2003), machine translation (Watanabe et al., 2007), and semantic parsing (Iyyer et al., 2017). Kummerfeld et al. (2015) found that max-margin methods generally outperform likelihood maximization on a range of tasks. Previous work has studied connections between margin-based methods and likelihood maximization in the supervised learning setting.…”
Section: Related Work
confidence: 99%