Word embedding representations provide good estimates of word meaning and give state-of-the-art performance in semantic tasks. Embedding approaches differ as to whether and how they account for the context surrounding a word. We present a comparison of different word and context representations on the task of proposing substitutes for a target word in context (lexical substitution). We also experiment with tuning contextualized word embeddings on a dataset of sense-specific instances for each target word. We show that powerful contextualized word representations, which give high performance in several semantics-related tasks, deal less well with the subtle in-context similarity relationships needed for substitution. This is better handled by models trained with this objective in mind, where the interdependence between word and context representations is explicitly modeled during training.
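A common embedding-based approach to lexical substitution is to rank candidate substitutes by the similarity of their vectors to the contextualized representation of the target word. The sketch below illustrates this ranking step with toy vectors; the vector values and candidate words are hypothetical stand-ins for real embeddings, not the paper's actual models or data.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_substitutes(target_in_context_vec, candidates):
    """Rank candidate substitutes by cosine similarity to the
    contextualized vector of the target word in its sentence."""
    scored = [(word, cosine(target_in_context_vec, vec))
              for word, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional vectors standing in for real embeddings (hypothetical).
target = [0.9, 0.1, 0.3]
cands = {
    "bright": [0.8, 0.2, 0.3],
    "smart":  [0.7, 0.1, 0.4],
    "light":  [0.1, 0.9, 0.2],
}
ranking = rank_substitutes(target, cands)
print([word for word, _ in ranking])  # best-fitting substitute first
```

In practice the target vector would come from a contextualized encoder, so the same target word receives different substitute rankings in different sentences.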
Usage similarity estimation addresses the semantic proximity of word instances in different contexts. We apply contextualized (ELMo and BERT) word and sentence embeddings to this task, and propose supervised models that leverage these representations for prediction. Our models are further assisted by lexical substitute annotations automatically assigned to word instances by context2vec, a neural model that relies on a bidirectional LSTM. We perform an extensive comparison of existing word and sentence representations on benchmark datasets addressing both graded and binary similarity. The best performing models outperform previous methods in both settings.
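At its simplest, usage similarity estimation compares the contextualized vectors of the same word in two different sentences; a graded score can be thresholded for binary same-sense decisions. The sketch below assumes precomputed contextualized vectors and uses a hypothetical threshold value — it is an illustration of the task setup, not the paper's supervised models.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def usage_similarity(vec_a, vec_b, binary_threshold=0.5):
    """Graded usage similarity between two instances of a word,
    plus a thresholded binary same-usage decision.
    (The threshold here is a hypothetical choice for illustration.)"""
    score = cosine(vec_a, vec_b)
    return score, score >= binary_threshold

# Toy 2-dimensional vectors for two instances of "bank" (hypothetical):
bank_river = [0.9, 0.1]   # "we sat on the bank of the river"
bank_money = [0.1, 0.9]   # "she deposited cash at the bank"
bank_shore = [0.8, 0.2]   # "the boat drifted toward the bank"

print(usage_similarity(bank_river, bank_money))  # low score, different usage
print(usage_similarity(bank_river, bank_shore))  # high score, same usage
```

Supervised models like those described above would replace the raw cosine with a learned predictor over the two instance representations, optionally augmented with substitute annotations.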
In current models of the language faculty, the language system is taken to be divided by an interface with systems of thought. However, thought of the type expressed in language is difficult to access in language-independent terms. Potential interdependence of the two systems can be addressed by considering language under conditions of pathological changes in the neurotypical thought process. Speech patterns seen in patients with schizophrenia and formal thought disorder (FTD) present an opportunity to do this. Here we reanalyzed a corpus of severely thought-disordered speech with a view to capturing patterns of linguistic disintegration comparatively across hierarchical layers of linguistic organization: 1. Referential anomalies, subcategorized into the NP type involved, 2. Argument structure, 3. Lexis, and 4. Morphosyntax. Results showed significantly higher error proportions in referential anomalies against all other domains. Morphosyntax and lexis were comparatively least affected, while argument structure was intermediate. No differential impairment was seen in definite vs. indefinite NPs, or 3rd person pronouns vs. lexical NPs. Statistically significant differences in error proportions emerged within the domain of pronominals, where covert pronouns were more affected than overt pronouns, and 3rd person pronouns more than 1st and 2nd person ones. Moreover, copular clauses were more often anomalous than non-copular ones. These results provide evidence of how language and thought disintegrate together in FTD, with language disintegrating along hierarchical layers of linguistic organization and affecting specific construction types. A relative intactness of language at a procedural, morphosyntactic surface level masks a profound impairment in the referential functioning of language.