2016
DOI: 10.1007/978-3-319-48390-0_21

Combining Statistical Information and Semantic Similarity for Short Text Feature Extension

Abstract: A short text feature extension method combining statistical information and semantic similarity is proposed. First, after defining word contribution and mutual information, a set of associated word pairs is generated by comparing each mutual information value with a threshold; this set is then used as the query word set for searching HowNet. For each word pair, senses are found in the HowNet knowledge base, and the semantic similarity of the query word pair is calculated. Common sememes that satisfy the condition are added into…
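
The first step in the abstract, building an associated word-pair set by thresholding mutual information, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes pointwise mutual information estimated from document-level co-occurrence, and the function name `associated_word_pairs` and its `threshold` parameter are hypothetical. The word-contribution weighting and the HowNet lookup are not shown.

```python
import math
from collections import Counter
from itertools import combinations


def associated_word_pairs(docs, threshold):
    """Return {(w1, w2): pmi} for word pairs whose PMI exceeds `threshold`.

    docs: list of tokenized short texts (each a list of strings).
    Assumption: probabilities are estimated from document-level
    co-occurrence counts; the paper may define them differently.
    """
    n_docs = len(docs)
    word_df = Counter()  # number of texts containing each word
    pair_df = Counter()  # number of texts containing both words of a pair
    for tokens in docs:
        vocab = sorted(set(tokens))  # sorted so each pair has one canonical order
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    pairs = {}
    for (w1, w2), joint in pair_df.items():
        # PMI = log( p(w1, w2) / (p(w1) * p(w2)) ), with p estimated as count / n_docs
        pmi = math.log(joint * n_docs / (word_df[w1] * word_df[w2]))
        if pmi > threshold:
            pairs[(w1, w2)] = pmi
    return pairs
```

In the method described above, the pairs that pass the threshold would then serve as the query set for HowNet, where their senses are looked up and their semantic similarity is computed.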

Cited by 3 publications
(1 citation statement)
References 6 publications
“…Web searching [4] based feature extension technologies need to interact frequently with search engines and result in high communication overhead and low efficiency for data analysis. Knowledge bases or lexical databases, such as Wikipedia and HowNet for concept taxonomies [5][6][7] or topic models [8,9] are used to enrich short text representations. However, these feature extension method has high dependencies on the integrity of external resources, and often time consuming.…”
Section: Introduction (mentioning)
confidence: 99%