Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096) 1999
DOI: 10.1109/dcc.1999.755669

Text mining: a new frontier for lossless compression

Cited by 40 publications (37 citation statements)
References 5 publications
“…Compression methods have already found respectable application in various areas of text mining, as recently predicted by Witten et al [33]. Some of the areas include extraction of generic entities, token segmentation, acronym extraction [32] and text categorization [32,10].…”
Section: Compression Methods In Text Mining (mentioning)
confidence: 99%
“…Other applications of compression algorithms include identification of structured sources, where a structured source outputs specially formatted sequences, or even acronym finders where acronyms were found as the occurrences of typically capital letters with significantly different letter or language statistics [32,33].…”
Section: Compression Methods In Text Mining (mentioning)
confidence: 99%
“…To do so requires new techniques of information mining. We are working on these [13], but they are still in their infancy and we have not begun to apply them to harvesting information about music. Moreover, items should be cross-referenced to other musical sites.…”
Section: Browsing and Searching (mentioning)
confidence: 99%
“…PST have been applied to gene/protein sequence clustering [66] and spam filtering [45]. Researchers have tried to do text classification by using PPM, PPM* and other text compression methods [5,18,43,58,64].…”
Section: The Generative Approach (mentioning)
confidence: 99%