Proceedings of the 7th Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2017)
DOI: 10.18653/v1/w17-0706

Predicting Japanese scrambling in the wild

Abstract: Japanese speakers have a choice between canonical SOV and scrambled OSV word order to express the same meaning. Although previous experiments have examined the influence of one or two factors on scrambling in controlled settings, it is not yet known how multiple factors jointly contribute to scrambling. This study uses naturally distributed data to test multiple effects on scrambling simultaneously. A regression analysis replicates the NP length effect and suggests an influence of noun type, but it provide…
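As a rough illustration of the kind of analysis the abstract describes, a logistic regression over corpus-extracted clauses might look like the sketch below. The file name and column names (length_diff, obj_noun_type, scrambled) are assumptions for the example, not the paper's actual feature set.

```python
# Minimal sketch: logistic regression predicting scrambled (OSV) vs. canonical (SOV)
# order from NP length difference and noun type. Column names are illustrative.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("clauses.csv")  # hypothetical file: one row per transitive clause
# length_diff: object NP length minus subject NP length (e.g., in chunks)
# obj_noun_type: e.g., "pronoun", "proper", "common"
# scrambled: 1 if OSV, 0 if SOV
model = smf.logit("scrambled ~ length_diff + C(obj_noun_type)", data=df).fit()
print(model.summary())
```

A positive coefficient on length_diff would correspond to the long-before-short tendency discussed in the citing papers below.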

Cited by 5 publications (5 citation statements)
References 18 publications
“…The effect of "long-before-short," the tendency for a long constituent to precede a short one, has been reported in several studies (Asahara et al., 2018; Orita, 2017). We checked whether this effect can be captured with the LM-based method. Among the examples used in Section 5.2, we analyzed about 9.5k examples in which the position of the constituent with the largest number of chunks differed between the canonical case order and the order supported by LMs.…”
Section: Long-before-short Effect (mentioning)
confidence: 99%
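One way to realize the LM-based comparison described in the statement above is to score both orderings of a clause with a causal language model and prefer the lower total negative log-likelihood. This is a sketch only: the model name is just one publicly available Japanese GPT-2, and the example sentences are illustrative, not drawn from the cited experiments.

```python
# Sketch: compare canonical (SOV) vs. scrambled (OSV) order by approximate
# total negative log-likelihood under a causal LM. Any Japanese
# autoregressive LM could be substituted for the model named here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "rinna/japanese-gpt2-medium"  # assumed model choice
tok = AutoTokenizer.from_pretrained(name)
lm = AutoModelForCausalLM.from_pretrained(name).eval()

def total_nll(sentence: str) -> float:
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss   # mean NLL per predicted token
    return loss.item() * ids.size(1)      # rescale so lengths stay comparable

sov = "太郎が長い手紙を書いた。"   # illustrative SOV clause
osv = "長い手紙を太郎が書いた。"   # its scrambled OSV variant
print("LM prefers:", "OSV" if total_nll(osv) < total_nll(sov) else "SOV")
```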
“…In this study, we specifically focus on the Japanese language because of its complex and flexible word order. There have been many claims about the canonical word order of Japanese, and it has attracted considerable attention from linguists and natural language processing (NLP) researchers for decades (Hoji, 1985; Saeki, 1998; Miyamoto, 2002; Matsuoka, 2003; Koizumi and Tamaoka, 2004; Nakamoto et al., 2006; Shigenaga, 2014; Sasano and Okumura, 2016; Orita, 2017; Asahara et al., 2018).…”
Section: Introduction (mentioning)
confidence: 99%
“…Because Japanese exhibits a flexible word order, potential factors that predict the word order of a given construction in Japanese have recently been investigated, particularly in the field of computational linguistics (Yamashita and Kondo, 2011; Orita, 2017). One of the major findings relevant to the current study is 'long-before-short', whereby a long noun phrase (NP) tends to be scrambled ahead of a short NP (Yamashita and Chang, 2001).…”
Section: Introduction (mentioning)
confidence: 96%
“…Incorporating the information status of an NP with another factor, 'long-before-short', proposed in previous studies (Sasano and Okumura, 2016; Orita, 2017), we built a statistical model to predict the word orders in the DOC. One important advantage of our study is that, with the latest version of the corpus we used (see Section 3), the information status of an NP can be analyzed not simply as a binary distinction between pronoun (given) and other (new), but by the number of coindexed items in the preceding text.…”
Section: Introduction (mentioning)
confidence: 99%
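The graded information-status predictor described in that statement could, under assumptions about how coreference is annotated, be approximated by counting coindexed mentions that precede each NP. The function and field names below are hypothetical, purely to make the idea concrete.

```python
# Hypothetical sketch: a graded information-status score for an NP,
# measured as the number of coindexed mentions earlier in the text.
from collections import defaultdict

def antecedent_counts(mentions):
    """mentions: list of (chain_id, position) in surface order.
    Returns, for each NP, how many earlier mentions share its chain."""
    seen = defaultdict(int)
    counts = []
    for chain_id, _ in mentions:
        counts.append(seen[chain_id])
        seen[chain_id] += 1
    return counts

# A first mention scores 0 (new); a third mention of the same chain scores 2 (given).
print(antecedent_counts([("x", 0), ("y", 1), ("x", 2), ("x", 3)]))  # [0, 0, 1, 2]
```

Such a count could then be added as a predictor alongside length_diff in a regression like the one sketched earlier.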
“…1 (Yamashita and Kondo, 2011; Orita, 2017; Hoji, 1985; Miyagawa, 1997; Matsuoka, 2003); Asahara, Nambu, and Sano (2018); Asahara et al. (2018); (Maekawa, Yamazaki, Ogiso, Maruyama, Ogura, Kashino, Koiso, Yamaguchi, Tanaka, and Den, 2014); BCCWJ 2015; Givón (1976) (assertion); Erteschik-Shir (1997, 2007); (Bayesian Linear Mixed Model; Sorensen, Hohenstein, and Vasishth, 2016)…”
(mentioning)
confidence: 99%