2010
DOI: 10.14778/1920841.1921007

Interesting-phrase mining for ad-hoc text analytics

Abstract: Large text corpora with news, customer mail and reports, or Web 2.0 contributions offer a great potential for enhancing business-intelligence applications. We propose a framework for performing text analytics on such data in a versatile, efficient, and scalable manner. While much of the prior literature has emphasized mining keywords or tags in blogs or social-tagging communities, we emphasize the analysis of interesting phrases. These include named entities, important quotations, market slogans, and other mul…

citations
Cited by 25 publications
(31 citation statements)
references
References 25 publications
“…It allows efficient parallel processing of data in a functional programming. In [1] a preprocessing and indexing methods for phrases, paired with new search techniques for the top-k most interesting phrases in ad-hoc subsets of the corpus are developed.…”
Section: Distributed Text Mining (mentioning)
confidence: 99%
“…To achieve a more accurate document clustering, a more informative feature word-sentence has been considered in recent research work [1]. While considering three levels (documents-sentences-words) based on k-partite graph [2], to represent the data set, we are able to deal with a dependency between all of them.…”
Section: Introduction (mentioning)
confidence: 99%
“…More recent results include text summarization with latent semantic indexing [1], extraction of key phrases [3,23], and summarization of information from multiple documents [22]. Summarisation of scientific papers was studied by Hassan et al [9].…”
Section: Related Work (mentioning)
confidence: 99%
“…Bedathur et al [6] devise a new algorithm called Forward Indexing to solve this problem of reading the entire set of inverted lists. Instead of storing each phrase's inverted list of document ID, Forward Indexing stores containing phrase ID lists (called forward list) for each document.…”
Section: Introduction (mentioning)
confidence: 99%
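
The forward-list idea in the statement above can be illustrated with a minimal sketch. It is not the cited authors' implementation: the function names, the use of word n-grams as phrases, the use of phrase strings in place of phrase IDs, and the scoring ratio (subset document frequency divided by corpus document frequency) are all assumptions made for illustration.

```python
from collections import Counter
import heapq


def build_forward_index(docs, max_len=3):
    """Build a forward list per document (hypothetical helper).

    docs maps a document ID to its text. Phrases are taken to be word
    n-grams of up to max_len words, and are kept as strings rather than
    integer phrase IDs to keep the sketch short.
    """
    forward = {}
    corpus_df = Counter()  # number of documents containing each phrase
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        phrases = {
            " ".join(tokens[i:i + n])
            for n in range(1, max_len + 1)
            for i in range(len(tokens) - n + 1)
        }
        forward[doc_id] = phrases
        corpus_df.update(phrases)
    return forward, corpus_df


def top_k_phrases(forward, corpus_df, subset, k):
    """Rank phrases for an ad-hoc subset by reading only its forward lists.

    The score used here (subset frequency / corpus frequency) is an
    assumed interestingness measure, chosen only for illustration.
    """
    subset_df = Counter()
    for doc_id in subset:
        subset_df.update(forward[doc_id])
    scored = ((subset_df[p] / corpus_df[p], p) for p in subset_df)
    return heapq.nlargest(k, scored)
```

Only the forward lists of the documents in the subset are read at query time, which is the point the statement makes about avoiding a scan of every phrase's inverted list.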
“…Besides Forward Indexing, [6] proposes another algorithm Prefix-Maximal Indexing to solve the problem, with a very compact index structure in memory of only up to half of the index size of Forward Indexing. This is achieved by storing only prefix-maximal phrases for each document, instead of all the containing phrase IDs.…”
Section: Introduction (mentioning)
confidence: 99%
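
As a rough illustration of what storing only prefix-maximal phrases means, the sketch below keeps a phrase for a document only if it is not a proper prefix of another phrase in the same document, and shows that the dropped phrases can be recovered as prefixes. The function names and the representation of phrases as token tuples are assumptions; the cited work's actual index layout is not reproduced here.

```python
def prefix_maximal(phrases):
    """Return the phrases that are not a proper prefix of another phrase.

    phrases is a set of token tuples for one document (assumed format).
    """
    proper_prefixes = {p[:i] for p in phrases for i in range(1, len(p))}
    return {p for p in phrases if p not in proper_prefixes}


def expand(maximal):
    """Recover every prefix (including the full phrase) of the stored phrases."""
    return {p[:i] for p in maximal for i in range(1, len(p) + 1)}
```

For a document with the tokens a b c and phrases of up to three words, six phrases are contained but only three need to be stored, since the rest are prefixes of the stored ones.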