2020
DOI: 10.1609/aaai.v34i05.6386

Towards Building a Multilingual Sememe Knowledge Base: Predicting Sememes for BabelNet Synsets

Abstract: A sememe is defined as the minimum semantic unit of human languages. Sememe knowledge bases (KBs), which contain words annotated with sememes, have been successfully applied to many NLP tasks. However, existing sememe KBs are built on only a few languages, which hinders their widespread utilization. To address the issue, we propose to build a unified sememe KB for multiple languages based on BabelNet, a multilingual encyclopedic dictionary. We first build a dataset serving as the seed of the multilingual semem…

Cited by 15 publications (12 citation statements)
References 17 publications

“…We can see that the distilled version of EDSKB has overall higher consistency than HowNet, and the full version of EDSKB yields lower consistency results. It is not strange because CCSA is based on sememe prediction and according to previous work (Qi et al, 2020a), senses with more sememes usually have lower prediction performance. Since the full version of EDSKB has much more sememes per sense than HowNet, it is actually not fair to compare their consistency using CCSA.…”
Section: Intrinsic Evaluation (mentioning)
confidence: 98%
“…Qi et al (2018) propose the task of cross-lingual lexical sememe prediction, aiming to extend HowNet to a new language by predicting sememes for words in that language. Qi et al (2020a) present a more efficient way to extend HowNet to other languages, i.e., building a multilingual SKB based on BabelNet (Navigli and Ponzetto, 2012). BabelNet is composed of multilingual synsets that contain synonyms in many languages.…”
Section: Expansion Of HowNet (mentioning)
confidence: 99%
“…To generate Chinese distant supervision knowledge, we apply Hanlp to link the entities involved in Cn-SynSets to two Chinese web corpora, namely, Baidu Encyclopedia articles and SogouCA. Next, we generate two real-world Chinese entity synonym set datasets: BDSynSetTra and SGSynSetTra, respectively, denoted as…”
Section: Distant Supervision Knowledge Acquisition (mentioning)
confidence: 99%
“…Mining entity synonym set is an important task for many entity-based downstream applications, such as knowledge graph construction [1][2][3][4], taxonomy learning [5][6][7][8], and question answering [9][10][11]. An entity synonym set usually contains several different strings representing an identical entity [12][13][14].…”
Section: Introduction (mentioning)
confidence: 99%