2015
DOI: 10.7763/ijmlc.2015.v5.503

TR-LDA: A Cascaded Key-Bigram Extractor for Microblog Summarization

Abstract: Microblog summarization can save users a large amount of browsing time. However, summarizing microblogs is more challenging than summarizing traditional documents because posts are noisy and sparse. In this paper, we propose an unsupervised method named TR-LDA that summarizes microblogs by cascading two key-bigram extractors based on TextRank and Latent Dirichlet Allocation (LDA). The cascading strategy yields a key-bigram set with better noise immunity. Two sentence ranking strategies…
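
The cascading idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the co-occurrence graph (bigrams linked when they share a post), the damping factor, the cut-off sizes, and the names `textrank_bigrams`, `document_frequency`, and `cascade` are all assumptions made for illustration. A simple document-frequency score stands in for the LDA-based stage, whose details the truncated abstract does not give.

```python
from collections import defaultdict
from itertools import combinations


def bigrams(tokens):
    """Return the adjacent word pairs of a token list."""
    return list(zip(tokens, tokens[1:]))


def textrank_bigrams(posts, damping=0.85, iterations=50):
    """Score bigrams with a PageRank-style iteration.

    Assumption: two bigrams are linked when they occur in the same post.
    """
    neighbours = defaultdict(set)
    for post in posts:
        grams = set(bigrams(post.lower().split()))
        for g in grams:
            neighbours[g]  # register bigrams that have no neighbours
        for a, b in combinations(grams, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)

    scores = {g: 1.0 for g in neighbours}
    for _ in range(iterations):
        updated = {}
        for g in neighbours:
            incoming = sum(scores[n] / len(neighbours[n]) for n in neighbours[g])
            updated[g] = (1 - damping) + damping * incoming
        scores = updated
    return scores


def document_frequency(posts):
    """Hypothetical second-stage score: number of posts containing a bigram."""
    counts = defaultdict(int)
    for post in posts:
        for g in set(bigrams(post.lower().split())):
            counts[g] += 1
    return counts


def cascade(first_scores, second_score, keep_first=4, keep_second=2):
    """Keep the best candidates of stage one, then re-rank them with stage two."""
    shortlist = sorted(first_scores, key=first_scores.get, reverse=True)[:keep_first]
    return sorted(shortlist, key=second_score, reverse=True)[:keep_second]


posts = [
    "heavy rain floods city centre",
    "heavy rain closes city centre roads",
    "sunny day",
]
scores = textrank_bigrams(posts)
df = document_frequency(posts)
key_bigrams = cascade(scores, lambda g: df.get(g, 0))
print(key_bigrams)
```

In this sketch the second stage only sees the bigrams that survive the first one, so a bigram has to do well under both scorers to be kept. That is one plausible reading of "cascading", not a description of the published method.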

Cited by 6 publications (3 citation statements)
References 12 publications

“…The 10 most frequent bigrams (Stanisław & Emmanuel, 2014; Wu et al, 2015) were used to interpret the following 11 topics: T1 = family and community, T2 = family and parenting, T3 = parental incarceration impact, T4 = ethnicity, gender and suicide attempt, T5 = sexual behavior and sexual minority, T6 = alcohol and childhood maltreatment, T7 = transition to adulthood and smoking initiation, T8 = birth weight and weight perception, T9 = violence victimization, T10 = twins and self‐control, and T11 = peer network and delinquent behavior (see Figure 7).…”
Section: Results (mentioning)
confidence: 99%
“…This discrepancy may have various explanations. For instance, previous studies have shown that the results of TMA can be influenced by textual features (Blei, 2012), genres, and/or lengths (Wu et al, 2015). Unlike in Leydesdorff and Nerghes's (2017) work, where free text was used, we benefited from using a more structured citation database that included a variety of features of bibliographic records.…”
Section: Discussion (mentioning)
confidence: 99%
“…Palomino (2011) presented a comparison of three unsupervised algorithms for keyword extraction on the Belga News Archive and showed, through information radius and a chi-square test, that TextRank was the most successful of the three. Wu (2015) also proposed an unsupervised method named TR-LDA for summarizing microblogs by cascading two key-bigram extractors based on TextRank and Latent Dirichlet Allocation (LDA), where two sentence ranking strategies were used based on the key-bigram sets. Successful applications of TextRank include multi-document summarization (Wan, 2007) and information retrieval (Blanco & Lioma, 2007; Blanco & Lioma, 2012), all of which achieved good performance.…”
Section: Introduction (mentioning)
confidence: 99%