2008 Eighth IEEE International Conference on Data Mining
DOI: 10.1109/icdm.2008.135

Text Cube: Computing IR Measures for Multidimensional Text Database Analysis

Abstract: Since Jim Gray introduced the concept of "data cube" in 1997, data cube, associated with online analytical processing (OLAP), has become a driving engine in data warehouse industry. Because the boom of Internet has given rise to an ever increasing amount of text data associated with other multidimensional information, it is natural to propose a data cube model that integrates the power of traditional OLAP and IR techniques for text. In this paper, we propose a Text-Cube model on multidimensional text database …

Cited by 91 publications (76 citation statements)
References 9 publications
“…Previous efforts in this regard have primarily focused on term-level analytics, identifying terms that are specific to a group of documents [6,16,24]. Some [7,8,13,15,19] have taken a multi-dimensional view of the text collection and proposed OLAP-style models for performing drill-down/roll-up on text databases, but remain at the level of terms.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…• Term hierarchies [9] [12] and [11] [9] proposed a new data cube called Text cube based on the star schema in which a textual dimension is represented by terms hierarchy. This hierarchy specifies the semantic relationships between textual terms extracted from documents, which allows semantic navigation in textual data thanks to two associated operators: pull-up (which generates a term level L′ from a lower term level L) and push-down (which generates a term level L′ from a higher term level L).…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…In addition to generating non-summarizable structures, TestBlox can also synthesize any desired summarizable structure. We expect that such structures would be particularly useful for validation of new kinds of decision support facilities [6,11,14,27,29,44], as they usually assume summarizable analysis spaces. For example, one could apply random testing to validate a new cloud-based OLAP engine [11,6] against a standard relational one [34], using a DAG generator [8] to synthesize random dimensional schemas, and TestBlox to synthesize random summarizable instances of those schemas.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…As such, they have become indispensable tools for experts in many fields (e.g., business, medicine, education, research, and government [31]), and the demand for them is growing. With the rapid increase in the variety and volume of data being collected, there is significant interest in extending decision support facilities to both non-relational datasets (such as streams [14], sequences [29], text [27], and networks [44]) and extremely large relational datasets [11,6] that exceed storage capacities of traditional data warehouses.…”
Section: Introduction (citation type: mentioning)
confidence: 99%