Homonym normalisation by word sense clustering: a case in Japanese

Sato, Yo; Heffernan, Kevin

doi:10.18653/v1/2020.coling-main.295

Proceedings of the 28th International Conference on Computational Linguistics 2020

Abstract: This work presents a method of word sense clustering that differentiates homonyms and merges homophones, taking Japanese as an example, where orthographical variation causes problems for language processing. It uses contextualised embeddings (BERT) to cluster tokens into distinct sense groups, and we use these groups to normalise synonymous instances to a single representative form. We see the benefit of this normalisation in language modelling, as well as in transliteration.
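
The pipeline the abstract describes (embed tokens in context, cluster them into sense groups, then rewrite each group to one representative form) can be sketched roughly as below. This is an illustrative sketch only, not the authors' implementation: the two-dimensional vectors are hand-made stand-ins for BERT embeddings, and the greedy cosine-threshold clustering, the threshold value, the majority-vote choice of representative, and all function names are assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_tokens(vectors, threshold=0.8):
    """Greedy single-pass clustering (an assumed stand-in for the paper's method).

    Each vector joins the first cluster whose centroid is at least
    `threshold` similar; otherwise it starts a new cluster.
    """
    clusters, centroids = [], []
    for i, vec in enumerate(vectors):
        for c, centroid in enumerate(centroids):
            if cosine(vec, centroid) >= threshold:
                clusters[c].append(i)
                n = len(clusters[c])
                # Update the running mean of the cluster.
                centroids[c] = [(m * (n - 1) + x) / n for m, x in zip(centroid, vec)]
                break
        else:
            clusters.append([i])
            centroids.append(list(vec))
    return clusters


def normalise(surface_forms, clusters):
    """Rewrite every token to the most frequent surface form in its cluster."""
    result = list(surface_forms)
    for members in clusters:
        representative = Counter(surface_forms[i] for i in members).most_common(1)[0][0]
        for i in members:
            result[i] = representative
    return result


# Hypothetical tokens sharing the reading "atsui": 暑い (hot weather),
# 厚い (thick), and the ambiguous hiragana spelling あつい.
surface_forms = ["暑い", "あつい", "暑い", "厚い", "あつい", "厚い"]
# Toy "contextual embeddings": the first three occur in weather contexts,
# the last three in thickness contexts.
vectors = [[1.0, 0.1], [0.9, 0.2], [0.95, 0.0],
           [0.1, 1.0], [0.0, 0.9], [0.2, 1.0]]

clusters = cluster_tokens(vectors)
normalised = normalise(surface_forms, clusters)
print(clusters)
print(normalised)
```

With these toy vectors, the two hiragana tokens fall into different clusters (the homonym is differentiated), and each is rewritten to the kanji form that dominates its cluster (the orthographic variants are merged). In a real setting the vectors would come from a contextual encoder and the clustering method would need tuning on held-out data.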

Cited by 1 publication

References 16 publications

Lexical Semantic Change through Large Language Models: a Survey

Periti, Montanelli

2024

ACM Comput. Surv.

Lexical Semantic Change (LSC) is the task of identifying, interpreting, and assessing the possible change over time in the meanings of a target word. Traditionally, LSC has been addressed by linguists and social scientists through manual and time-consuming analyses, which have thus been limited in terms of the volume, genres, and time-frame that can be considered. In recent years, computational approaches based on Natural Language Processing have gained increasing attention to automate LSC as much as possible. Significant advancements have been made by relying on Large Language Models (LLMs), which can handle the multiple usages of the words and better capture the related semantic change. In this article, we survey the approaches based on LLMs for LSC and we propose a classification framework characterized by three dimensions: meaning representation, time-awareness, and learning modality. The framework is exploited to i) review the measures for change assessment, ii) compare the approaches on performance, and iii) discuss the current issues in terms of scalability, interpretability, and robustness. Open challenges and future research directions about the use of LLMs for LSC are finally outlined.
