“…where D(x) is the set of all possible dependency arcs of sentence x, 1[·] is the indicator function, and µ(x, i, j) is the expected count defined as follows,

… (Jiang et al, 2016), and Convex-MST (Grave and Elhadad, 2015)

Methods                                       WSJ10   WSJ

Basic Setup
Feature DMV (Berg-Kirkpatrick et al, 2010)    63.0    -
UR-A E-DMV (Tu and Honavar, 2012)             71.4    57.0
Neural E-DMV (Jiang et al, 2016)              69.7    52.5
Neural E-DMV (Good Init) (Jiang et al, 2016)  72.5    57.6

Basic Setup + Universal Linguistic Prior
Convex-MST (Grave and Elhadad, 2015)          60.8    48.6
HDP-DEP (Naseem et al, 2010)                  71.9    -
CRFAE                                         71.7    55.7

Systems Using Extra Info
LexTSG-DMV (Blunsom and Cohn, 2010)           67.7    55.7
CS (Spitkovsky et al, 2013)                   72.0    64.4
MaxEnc (Le and Zuidema, 2015)                 73.2    65.8

Table 3: Comparison of recent unsupervised dependency parsing systems on English. Basic setup is the same as our setup except that linguistic prior is not used.…”