This paper describes a joint model of word segmentation and phonological alternations, which takes unsegmented utterances as input and infers word segmentations and underlying phonological representations. The model is a Maximum Entropy or log-linear model, which can express a probabilistic version of Optimality Theory (OT; Prince and Smolensky, 2004), a standard phonological framework. The features in our model are inspired by OT's Markedness and Faithfulness constraints. Following the OT principle that such features indicate "violations", we require their weights to be non-positive. We apply our model to a modified version of the Buckeye corpus (Pitt et al., 2007) in which the only phonological alternations are deletions of word-final /d/ and /t/ segments. The model sets a new state-of-the-art for this corpus for word segmentation, identification of underlying forms, and identification of /d/ and /t/ deletions. We also show that the OT-inspired sign constraints on feature weights are crucial for accurate identification of deleted /d/s; without them our model posits approximately 10 times more deleted underlying /d/s than appear in the manually annotated data.
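The sign constraint on feature weights can be illustrated with a minimal sketch (not the paper's implementation): a log-linear model over candidate forms whose features are OT-style violation counts, with weights projected onto the non-positive orthant so each violation can only lower a candidate's score. The constraint names and violation counts below are assumptions for illustration.

```python
import math

def project_nonpositive(weights):
    """Clip weights to the non-positive orthant, mirroring the
    OT-inspired sign constraint on violation features."""
    return [min(w, 0.0) for w in weights]

def candidate_probs(violations, weights):
    """P(candidate) proportional to exp(sum_i w_i * f_i(candidate))."""
    scores = [math.exp(sum(w * f for w, f in zip(weights, v)))
              for v in violations]
    z = sum(scores)
    return [s / z for s in scores]

# Two candidates, two hypothetical constraints (e.g. a Markedness
# and a Faithfulness constraint):
violations = [[1, 0],   # candidate A violates constraint 1 once
              [0, 2]]   # candidate B violates constraint 2 twice
weights = project_nonpositive([-1.0, 0.5])  # positive weight clipped to 0
probs = candidate_probs(violations, weights)
```

With the second weight clipped to zero, candidate B's violations become cost-free and B receives the higher probability; an unconstrained model could instead reward violations, which is the failure mode the sign constraint rules out.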
Computation speed for distance transforms is important in a wide variety of image processing applications, yet the current ITK library filters gain no benefit from multithreaded environments. We introduce a parallel implementation of the three-dimensional signed exact Euclidean distance transform algorithm developed by Maurer et al., with a theoretical complexity of O(n/p) for n voxels and p threads. Through this parallelization and efficient use of data structures, we obtain an approximately 3-fold mean speedup on standard tests on a 4-processor machine compared with the current ITK exact Euclidean distance transform filter.
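The parallelization structure can be sketched in miniature (this is not the ITK/Maurer implementation): Maurer-style exact distance transforms work dimension by dimension, so the independent per-row and per-column passes can be farmed out to a pool of workers, which is where the O(n/p) scaling comes from. The 2D example below uses a brute-force O(k) scan per output pixel in the column pass for clarity; Maurer et al. replace that scan with a linear-time lower-envelope computation.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def row_pass(row):
    """1D distance (in pixels) to the nearest feature (True) in this row."""
    n = len(row)
    d = [math.inf] * n
    for i in range(n):               # forward sweep
        if row[i]:
            d[i] = 0.0
        elif i > 0:
            d[i] = d[i - 1] + 1.0
    for i in range(n - 2, -1, -1):   # backward sweep
        d[i] = min(d[i], d[i + 1] + 1.0)
    return d

def column_pass(col):
    """Exact 2D distance from per-row 1D distances (brute-force scan)."""
    k = len(col)
    return [min(math.hypot(col[j], j - i) for j in range(k))
            for i in range(k)]

def edt(image, threads=4):
    """Exact Euclidean distance transform; rows and columns are
    independent, so each pass is parallelized across a thread pool."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        rows = list(pool.map(row_pass, image))            # pass 1: rows
        cols = [[rows[i][j] for i in range(len(image))]
                for j in range(len(image[0]))]
        out_cols = list(pool.map(column_pass, cols))      # pass 2: columns
    return [[out_cols[j][i] for j in range(len(out_cols))]
            for i in range(len(image))]

# 3x3 image with a single feature pixel at the center:
img = [[False, False, False],
       [False, True,  False],
       [False, False, False]]
dist = edt(img)
```

A 3D version adds a third pass over the remaining axis; each pass remains embarrassingly parallel over its 1D sub-problems.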
<p>Exceptions to morphological regularities often pattern together phonologically. In the English past tense, exceptions to the regular ‘Add /-d/’ rule frequently inhabit ‘Islands of Reliability’ (Albright & Hayes, 2003), in which a group of words take the same irregular past and also pattern together on a set of phonological characteristics. Adults seem to have implicit knowledge of both the overall pattern (the regular past) and the ‘subgeneralizations’.</p><p>We model this knowledge of subgeneralizations through the interaction of a structured lexicon and a Maximum Entropy grammar. Words that pattern together with respect to a particular morphological process are grouped into a ‘bundle’, which is indexed to a constraint expressing the change that these words undergo to realize the morpheme. These ‘operational constraints’ compete with markedness and faithfulness in the phonological component. The phonological regularity of a bundle is represented by the average of constraint violations for members. Novel words are assigned a bundle on the basis of similarity to these averages.</p><p>Our model shows promising correspondence with human data, including biases toward regularity and Island of Reliability effects. The model’s joint learning approach to phonology and morphology, as well as an inclusive concept of ‘context’, show promise for future application.</p>
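The bundle-assignment step can be sketched as follows (assumed details, not the authors' implementation): each bundle is summarized by the mean of its members' constraint-violation vectors, and a novel word is assigned to the bundle with the nearest mean. Euclidean distance and the toy violation vectors are assumptions for illustration.

```python
import math

def bundle_mean(vectors):
    """Average the constraint-violation vectors of a bundle's members."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def assign_bundle(novel, bundles):
    """Return the bundle whose mean violation vector is closest to the
    novel word's vector (Euclidean distance assumed)."""
    return min(bundles,
               key=lambda name: math.dist(novel, bundle_mean(bundles[name])))

# Toy bundles: a regular 'Add /-d/' bundle vs. a hypothetical irregular
# island, each member represented by a 2-constraint violation profile.
bundles = {
    "regular": [[0, 1], [1, 1], [0, 2]],
    "island":  [[3, 0], [4, 0], [3, 1]],
}
choice = assign_bundle([3, 0], bundles)  # novel word near the island mean
```

A novel word whose violation profile sits near the island's average is drawn into the irregular bundle, reproducing the Island of Reliability effect in miniature.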
The experimental study of artificial language learning has become a widely used means of investigating the predictions of theories of language learning and representation. Although much is now known about the generalizations that learners make from various kinds of data, relatively little is known about how those representations affect speech processing. This paper presents an event-related potential (ERP) study of brain responses to violations of lab-learned phonotactics. Novel words that violated a learned phonotactic constraint elicited a larger Late Positive Component (LPC) than novel words that satisfied it. Similar LPCs have been found for violations of natively acquired linguistic structure, as well as for violations of other types of abstract generalizations, such as musical structure. We argue that lab-learned phonotactic generalizations are represented abstractly and affect the evaluation of speech in a manner that is similar to natively acquired syntactic and phonological rules.