Sequential Document Visualization

Mao, Yi; Dillon, Joshua V.; Lebanon, Guy

doi:10.1109/tvcg.2007.70592

Cited by 35 publications

(32 citation statements)

References 14 publications


“…technique of sequential document visualization [9] has been developed to exploit the inherent sequentiality of a document. It can greatly facilitate document segmentation, topic extraction, and information retrieval.…”

Section: Fig 1 Three Main Components Of Our Approach (mentioning)

confidence: 99%

“…To summarize the contribution of this paper, we overcome the limitation of the previous sequential document visualization scheme [9] with a more informative and perceivable two-dimensional picture-based representation. The suite of techniques presented in this paper, including the multi-scale content summarization and focus+context visualization, can also be seamlessly incorporated with a new parametric document representation.…”

Section: Fig 1 Three Main Components Of Our Approach (mentioning)

confidence: 99%


Sequential document visualization based on hierarchical parametric histogram curves

Chen

Wang

Peng

et al. 2012

Tsinghua Sci. Technol.


Abstract: Recently, sequential document visualization has attracted much attention for its superior capability in depicting the sequential semantic progression in a single document. However, existing methods commonly take abstractive visual forms such as texts, numbers, and glyphs, and require much user expertise for document exploration. In this paper we propose a sequential visualization to represent a single document with a two-dimensional picture-based storyline, which semantically enhances the comprehension of textual information. We introduce a new parametric modeling approach called the Hierarchical Parametric Histogram Curve (HPHC), which encodes the statistical progression locally and adaptively. By transforming an HPHC into the two-dimensional space with a new locality-preserving embedding algorithm, we create a mapping from points along the curve to descriptive pictures and generate the visualization result. The new representation expresses the primary content with a graphical form, and allows for efficient multi-resolution and focus+context exploration in a long document. Our approach compares favorably with previous work in that it is more intuitive and requires less user expertise. Informal evaluation shows that it is useful in quick document browsing, communication, and understanding, especially for people with low literacy skills.


“…Data type like as documents, texts (available on web or stored on disk), are likely to be viewed by specific tools found. For example, see [17,27,35,55]. Algorithms and software are also data for which there are visualization tools developed specifically for this data type.…”

Section: Analysis On the Data Type Parameter (mentioning)

confidence: 99%

Visualization Techniques: Which is the Most Appropriate in the Process of Knowledge Discovery in Data Base?

Dias

Yamaguchi

Rabelo

et al. 2012

Advances in Data Mining Knowledge Discovery and Applications


“…On the other hand, windows that are too small increase the problem size, making it longer (and harder) to solve. This windowing approach is similar to the one used in [14] which use bigrams and trigrams as basic units for document visualization.…”

Section: 2 (mentioning)

confidence: 99%

A visual analytics approach to model learning

Garg

Ramakrishnan

Mueller

2010

2010 IEEE Symposium on Visual Analytics Science and Technology


The process of learning models from raw data typically requires a substantial amount of user input during the model initialization phase. We present an assistive visualization system which greatly reduces the load on the users and makes the process of model initialization and refinement more efficient, problem-driven, and engaging. Utilizing a sequence segmentation task with a Hidden Markov Model as an example, we assign each token in the sequence a feature vector based on its various properties within the sequence. These vectors are then clustered according to similarity, generating a layout of the individual tokens in form of a node link diagram where the length of the links is determined by the feature vector similarity. Users may then tune the weights of the feature vector components to improve the segmentation, which is visualized as a better separation of the clusters. Also, as individual clusters represent different classes, the user can now work at the cluster level to define token classes, instead of labelling one entry at time. Inconsistent entries visually identify themselves by locating at the periphery of clusters, and the user then helps refine the model by resolving these inconsistencies. Our system therefore makes efficient use of the knowledge of its users, only requesting user assistance for non-trivial data items. It so allows users to visually analyze data at a higher, more abstract level, improving scalability.

INTRODUCTION

With the tremendous growth in physical and online data collection technology, we are now experiencing an explosion of digital information. Since a large amount of these data are unstructured, various machine learning techniques have been developed to assign structure to these data to make them machine readable. This process can allow the machine to reason with and draw insight from data almost automatically. However, all such tasks depend heavily on large amounts of user-tagged data as the starting point, and use various semi-supervised learning methods [19]. Due to the high user input required, such tagged data is difficult to construct. Further, data is dynamic, and as a dataset grows and changes, we might need to supplement the tagged data from time to time. We propose to make this task simpler and interactive by designing a system where the user can obtain a visual overview of the dataset, and in that visual interface only tags those data elements that the system cannot easily resolve itself.

One crucial idea behind our system is that given good feature vectors to represent each data point, points that are similar will be close-by in the feature vector space. Here, we mean data-points which though rich in semantics, do not have an explicit high-dimensional feature vector automatically attached to them. In such cases we need to design feature vectors to represent the semantics and structure of the data-points. We aim to achieve this in our system by designing feature vectors which encompass a data point's structure, context, and location in the dataset. If some s...
