Dependency Grammar Induction with Neural Lexicalization and Big Training Data

Han, Wenjuan; Jiang, Yong; Tu, Kewei

doi:10.18653/v1/d17-1176

Cited by 19 publications

(22 citation statements)

References 10 publications

(17 reference statements)


“…Recent work has sought to take advantage of word embeddings in unsupervised generative models with alternate approaches (Lin et al., 2015; Tran et al., 2016; Jiang et al., 2016; Han et al., 2017). Lin et al. (2015) build an HMM with Gaussian emissions on observed word embeddings, but they do not attempt to learn new embeddings.…”

Section: Syntax Model

mentioning

confidence: 99%

“…w/ gold POS tags (for reference only):

DMV (Klein and Manning, 2004): 55.1, 39.7
UR-A E-DMV (Tu and Honavar, 2012): 71.4, 57.0
MaxEnc (Le and Zuidema, 2015): 73.2, 65.8
Neural E-DMV (Jiang et al., 2016): 72.5, 57.6
CRFAE (Cai et al., 2017): 71.7, 55.7
L-NDMV (Big training data) (Han et al., 2017): 77.2, 63.2

Table 2: Directed dependency accuracy on section 23 of WSJ, evaluating on sentences of length 10 and all lengths. Starred entries (*) denote that the system benefits from additional punctuation-based constraints.…”

Section: Unsupervised Dependency Parsing Without Gold POS Tags

mentioning

confidence: 99%

“…Lin et al. (2015) build an HMM with Gaussian emissions on observed word embeddings, but they do not attempt to learn new embeddings. Tran et al. (2016), Jiang et al. (2016), and Han et al. (2017) extend HMM or dependency model with valence (DMV) (Klein and Manning, 2004) with multinomials that use word (or tag) embeddings in their parameterization. However, they do not represent the embeddings as latent variables.…”

Section: Syntax Model

mentioning

confidence: 99%

“…Columns: System, 10, all

w/o gold POS tags:

DMV (Klein and Manning, 2004): 49.6, 35.8
E-DMV (Headden III et al., 2009): 52.1, 38.2
UR-A E-DMV (Tu and Honavar, 2012): 58.9, 46.1
CS* (Spitkovsky et al., 2013): 72.0*, 64.4*
Neural E-DMV (Jiang et al., 2016): 55.3, 42.7
CRFAE (Cai et al., 2017): 37.…
(Klein and Manning, 2004): 55.1, 39.7
UR-A E-DMV (Tu and Honavar, 2012): 71.4, 57.0
MaxEnc (Le and Zuidema, 2015): 73.2, 65.8
Neural E-DMV (Jiang et al., 2016): 72.5, 57.6
CRFAE (Cai et al., 2017): 71.7, 55.7
L-NDMV (Big training data) (Han et al., 2017): 77.2, 63.2

parameters are initialized in the same way as in the POS tagging experiment. The directed dependency accuracy (DDA) is used for evaluation and we report accuracy on sentences of length 10 and all lengths.…”

Section: Unsupervised Dependency Parsing Without Gold POS Tags

mentioning

confidence: 99%
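
The citation statements above report directed dependency accuracy (DDA) on short sentences and on all sentences. As a rough illustration of what that metric measures, here is a minimal Python sketch. It assumes each parse is given as a list of head indices (0 for the root), and it treats the length bound as "at most `max_len` tokens"; the function name, the input format, and that reading of the length bound are assumptions made for illustration and are not taken from the cited papers.

```python
from typing import List, Optional


def directed_dependency_accuracy(
    gold: List[List[int]],
    pred: List[List[int]],
    max_len: Optional[int] = None,
) -> float:
    """Fraction of tokens whose predicted head matches the gold head.

    gold, pred: one list of head indices per sentence (0 = root).
    max_len:    if given, only sentences with at most max_len tokens count
                (an assumed reading of the length filter).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")
    correct = 0
    total = 0
    for g_heads, p_heads in zip(gold, pred):
        if len(g_heads) != len(p_heads):
            raise ValueError("sentence length mismatch between gold and pred")
        if max_len is not None and len(g_heads) > max_len:
            continue  # skip sentences longer than the bound
        correct += sum(1 for g, p in zip(g_heads, p_heads) if g == p)
        total += len(g_heads)
    return correct / total if total else 0.0


# Toy example: 4 of 5 heads are correct over two sentences.
gold = [[2, 0, 2], [0, 1]]
pred = [[2, 0, 1], [0, 1]]
print(directed_dependency_accuracy(gold, pred))             # 0.8
print(directed_dependency_accuracy(gold, pred, max_len=2))  # 1.0
```

Because the direction of each arc matters, a prediction that attaches two words to each other in the wrong direction counts as an error for the dependent token.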


Unsupervised Learning of Syntactic Structure with Invertible Neural Projections

Neubig

Berg-Kirkpatrick

2018

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing


Unsupervised learning of syntactic structure is typically performed using generative models with discrete latent variables and multinomial parameters. In most cases, these models have not leveraged continuous word representations. In this work, we propose a novel generative model that jointly learns discrete syntactic structure and continuous word representations in an unsupervised fashion by cascading an invertible neural network with a structured generative prior. We show that the invertibility condition allows for efficient exact inference and marginal likelihood computation in our model so long as the prior is well-behaved. In experiments we instantiate our approach with both Markov and tree-structured priors, evaluating on two tasks: part-of-speech (POS) induction, and unsupervised dependency parsing without gold POS annotation. On the Penn Treebank, our Markov-structured model surpasses state-of-the-art results on POS induction. Similarly, we find that our tree-structured model achieves state-of-the-art performance on unsupervised dependency parsing for the difficult training condition where neither gold POS annotation nor punctuation-based constraints are available.


“…Other work has used neural parameterization for structured models, such as dependency models (Han et al., 2017), hidden semi-Markov models (Wiseman et al., 2018), and context-free grammars (Kim et al., 2019).…”

mentioning

confidence: 99%

Scaling Hidden Markov Language Models

Chiu

Rushton

2020

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)


The hidden Markov model (HMM) is a fundamental tool for sequence modeling that cleanly separates the hidden state from the emission structure. However, this separation makes it difficult to fit HMMs to large datasets in modern NLP, and they have fallen out of use due to very poor performance compared to fully observed models. This work revisits the challenge of scaling HMMs to language modeling datasets, taking ideas from recent approaches to neural modeling. We propose methods for scaling HMMs to massive state spaces while maintaining efficient exact inference, a compact parameterization, and effective regularization. Experiments show that this approach leads to models that are more accurate than previous HMM and n-gram-based methods, making progress towards the performance of state-of-the-art neural models.
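
The abstract above refers to "efficient exact inference" in HMMs. For readers unfamiliar with the term, the sketch below shows the textbook forward algorithm, which computes the exact log-likelihood of an observation sequence in time linear in its length. It is a generic illustration for a small discrete HMM, not the scaling method proposed in the cited paper; the function names and the nested-list parameter layout are assumptions.

```python
import math
from typing import List, Sequence


def logsumexp(xs: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))


def hmm_log_likelihood(
    obs: Sequence[int],
    log_init: List[float],         # log_init[s]     = log P(z_1 = s)
    log_trans: List[List[float]],  # log_trans[r][s] = log P(z_t = s | z_{t-1} = r)
    log_emit: List[List[float]],   # log_emit[s][o]  = log P(x_t = o | z_t = s)
) -> float:
    """Exact log P(obs) via the forward algorithm, computed in log space."""
    if not obs:
        raise ValueError("obs must contain at least one observation")
    n = len(log_init)
    # alpha[s] = log P(x_1..x_t, z_t = s)
    alpha = [log_init[s] + log_emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[r] + log_trans[r][s] for r in range(n)]) + log_emit[s][o]
            for s in range(n)
        ]
    return logsumexp(alpha)


# Toy two-state HMM with uniform initial and transition probabilities.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.5)] * 2, [math.log(0.5)] * 2]
log_emit = [
    [math.log(0.9), math.log(0.1)],
    [math.log(0.2), math.log(0.8)],
]
# With uniform transitions the observations are independent, so
# P(obs = [0, 1]) = 0.55 * 0.45, which is approximately 0.2475.
print(math.exp(hmm_log_likelihood([0, 1], log_init, log_trans, log_emit)))
```

Each time step costs on the order of the number of states squared, which is one reason very large state spaces call for the more compact parameterizations that the abstract mentions.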


Unsupervised Grammar Induction with Depth-bounded PCFG

Jin

Doshi‐Velez

Miller

et al. 2018

TACL


There has been recent interest in applying cognitively or empirically motivated bounds on recursion depth to limit the search space of grammar induction models (Ponvert et al., 2011; Noji and Johnson, 2016; Shain et al., 2016). This work extends this depth-bounding approach to probabilistic context-free grammar induction (DB-PCFG), which has a smaller parameter space than hierarchical sequence models, and therefore more fully exploits the space reductions of depth-bounding. Results for this model on grammar acquisition from transcribed child-directed speech and newswire text exceed or are competitive with those of other models when evaluated on parse accuracy. Moreover, grammars acquired from this model demonstrate a consistent use of category labels, something which has not been demonstrated by other acquisition models.
