Proceedings of the Eighth Meeting of the ACL Special Interest Group on Computational Phonology and Morphology (SIGPHON '06), 2006
DOI: 10.3115/1622165.1622167
Improving syllabification models with phonotactic knowledge

Abstract: We report on a series of experiments with probabilistic context-free grammars predicting English and German syllable structure. The treebank-trained grammars are evaluated on a syllabification task. The grammar used by Müller (2002) serves as the point of comparison. As she evaluates the grammar only for German, we reimplement the grammar and experiment with additional phonotactic features. Using bi-grams within the syllable, we can model the dependency on the previous consonant in the onset and coda. A 10-fold c…
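The abstract's key addition is conditioning each consonant in an onset or coda on the preceding consonant of the same cluster, i.e. a bigram model inside the syllable. Below is a minimal sketch of that idea; the probability tables, the vowel set, and the exhaustive search are all illustrative assumptions, not the paper's treebank-trained PCFG.

```python
# Sketch: score candidate syllabifications with syllable-internal
# consonant bigrams. All probabilities here are toy values.
import itertools
import math

VOWELS = {"a", "e", "i", "o", "u"}

# Hypothetical bigram tables P(consonant | previous consonant in the
# same cluster); "#" marks the start of an onset or coda.
ONSET_BIGRAM = {("#", "s"): 0.3, ("s", "t"): 0.5, ("#", "t"): 0.2,
                ("t", "r"): 0.4, ("#", "r"): 0.1}
CODA_BIGRAM = {("#", "n"): 0.3, ("n", "t"): 0.4, ("#", "t"): 0.3}

def cluster_prob(cluster, table):
    """Bigram probability of a consonant cluster (empty cluster = 1)."""
    prob, prev = 1.0, "#"
    for c in cluster:
        prob *= table.get((prev, c), 1e-6)  # small floor for unseen bigrams
        prev = c
    return prob

def syllable_prob(syl):
    """Score one syllable as onset * coda around a vowel nucleus."""
    i = next((k for k, c in enumerate(syl) if c in VOWELS), None)
    if i is None:
        return 0.0  # a syllable must contain a nucleus
    j = i + 1
    while j < len(syl) and syl[j] in VOWELS:
        j += 1  # nucleus = maximal vowel run
    return cluster_prob(syl[:i], ONSET_BIGRAM) * cluster_prob(syl[j:], CODA_BIGRAM)

def best_syllabification(phonemes):
    """Exhaustively try every segmentation and keep the most probable one.
    Exponential in word length -- a real PCFG gets this from chart parsing."""
    n = len(phonemes)
    best, best_p = None, 0.0
    for bits in itertools.product([0, 1], repeat=n - 1):
        cuts = [0] + [k + 1 for k, b in enumerate(bits) if b] + [n]
        syls = [phonemes[a:b] for a, b in zip(cuts, cuts[1:])]
        p = math.prod(syllable_prob(s) for s in syls)
        if p > best_p:
            best, best_p = syls, p
    return best

print(best_syllabification(list("stront")))  # kept as a single syllable
```

The effect of the bigram conditioning is visible in `cluster_prob`: /t/ after /s/ in an onset is scored differently from /t/ cluster-initially, which is exactly the dependency the abstract describes.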

Cited by 9 publications (6 citation statements, 2009–2022) · References 15 publications

Citation statements (ordered by relevance):
“…For her language-independent PCFG-based approach, Müller (2006) reports 92.64% word accuracy on the set of 64K examples from CELEX using 10-fold cross-validation. The Learned EBG approach of van den Bosch (1997) achieves 97.78% word accuracy when training on approximately 60K examples.…”
Section: Syllabification Experiments (confidence: 99%)
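The word-accuracy figures quoted in these statements are exact-match scores on whole words under 10-fold cross-validation. A minimal sketch of that protocol, assuming hypothetical `train` and `syllabify` callables standing in for any of the cited systems:

```python
# Sketch: word accuracy under 10-fold cross-validation. `train` and
# `syllabify` are placeholders, not the cited systems' interfaces.
import random

def word_accuracy(predict, test_set):
    """Fraction of words whose predicted syllabification matches gold exactly."""
    correct = sum(1 for word, gold in test_set if predict(word) == gold)
    return correct / len(test_set)

def ten_fold_cv(data, train, syllabify, k=10, seed=0):
    """Average word accuracy over k held-out folds."""
    data = list(data)                       # copy so the caller's list is untouched
    random.Random(seed).shuffle(data)
    folds = [data[i::k] for i in range(k)]  # k disjoint folds
    scores = []
    for i in range(k):
        training = [x for j, f in enumerate(folds) if j != i for x in f]
        model = train(training)
        scores.append(word_accuracy(lambda w: syllabify(model, w), folds[i]))
    return sum(scores) / k
```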
“…With a hand-crafted grammar, Müller (2002) achieves 96.88% word accuracy on CELEX-derived syllabifications, with a training corpus of two million tokens. Without a hand-crafted grammar, she reports 90.45% word accuracy (Müller, 2006). Applying a standard smoothing algorithm and a fourth-order HMM, Demberg (2006) scores 98.47% word accuracy.…”
Section: Other Languages (confidence: 99%)
“…For instance, syllabification on Romanian using a simple Naïve Bayes classifier results in a low syllable error rate (SER) of 12.90% [14]. Other data-driven models include conditional random fields [15], decision trees, random forests, support vector machines [14], unsupervised models [16], hidden Markov models [17], the dropped-and-matched model [18], syllabification by analogy [13], and context-free grammars [19].…”
Section: Introduction (confidence: 99%)
“…For example, a simple Naïve Bayes classifier produces a quite low SER of around 12.90% for the Romanian language [20]. Other statistical models use decision trees [20] [21], treebanks [22], random forests [20], neural networks [23] [24] [25] [26], support vector machines [20] [27], finite-state transducers [28] [29], context-free grammars [30], hidden Markov models [31], syllabification by analogy [19], the dropped-and-matched model [32], n-grams [33], conditional random fields [34] [35], nearest neighbour [17], and unsupervised models [36].…”
Section: Introduction (confidence: 99%)
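The last two statements report syllable error rate (SER) rather than word accuracy. A minimal sketch under one common definition, Levenshtein distance between predicted and gold syllable sequences normalised by gold length; the cited papers may define SER slightly differently:

```python
# Sketch: syllable error rate as normalised Levenshtein distance over
# syllable sequences. This definition is an assumption, not taken from
# the cited papers.
def edit_distance(a, b):
    """Standard Levenshtein distance over two sequences (one-row DP)."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (x != y))  # substitution / match
    return dp[-1]

def syllable_error_rate(predicted, gold):
    return edit_distance(predicted, gold) / len(gold)

print(syllable_error_rate(["ro", "ma", "ni", "an"], ["ro", "ma", "nian"]))
```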