2021
DOI: 10.1016/j.softx.2021.100684

Document Towers: A MATLAB software implementing a three-dimensional architectural paradigm for the visual exploration of digital documents and libraries

Abstract: This article introduces the generic Document Towers paradigm, visualization, and software for visualizing the structure of paginated documents, based on the metaphor of documents-as-architecture. The Document Towers visualizations resemble three-dimensional building models and represent the physical boundaries of logical (e.g., titles, images), semantic (e.g., topics, named entities), graphical (e.g., typefaces, colors), and other types of information with spatial extent as a stack of rooms and floors. The sof…


Citations: cited by 3 publications (8 citation statements)
References: 9 publications
“…This article's contention that spatial organization is informative has also been applied to document summarization. For example, this author has introduced the Document Towers visualization paradigm, which represents the three-dimensional structure of bounding boxes of paragraphs, images, and other entities in paginated documents as architectural wire-mesh models that resemble buildings and cities [104]. The quantified Structural Information Potential is encoded as a color-coded "ribbon", which allows users to take stock of features such as document fragmentation, regularity, and outliers without opening the document itself.…”
Section: B Layout-based Document Informativeness Triage
Citation type: mentioning
confidence: 99%
“…The Document Towers will serve as a case study to anchor the discussion with a concrete example of document structure representation. The Document Towers paradigm, visualization, and open-source software were first introduced in Atanasiu & Ingold (2021), and their usefulness for the quality control of digitization workflows has been evaluated on a real-world historical newspaper dataset at the Swiss National Library (Atanasiu 2022b). This publication in the Document Towers series intends to demonstrate (i) that the number of potential applications is much greater, (ii) that some of its key characteristics are common to other types of document structure representations, and (iii) that it facilitates the investigation of the role of such factors as spatialization, mystery, and Gesamtkunstwerk in information design.…”
Section: Case Study
Citation type: mentioning
confidence: 99%
“…Once object coordinates have been extracted, they are employed to create a three-dimensional graphical object; for example, by using the Document Towers open-source MATLAB software (Atanasiu & Ingold 2021). Figure 4 illustrates the process of simplification, extrusion, and projection leading from a page to a wireframe resembling an architectural model, where paragraphs and images correspond to rooms, double pages to floors, documents to buildings, and libraries to cities.…”
Section: Case Study
Citation type: mentioning
confidence: 99%
“…6 to 8 present layout characteristics of sample documents using the combined visual-numeric approach. For technical details on the measurements, see (Atanasiu and Ingold 2021) and (Atanasiu 2022b).…”
Section: State of the Art
Citation type: mentioning
confidence: 99%
“…Further reading - The role of exploration and serendipity in libraries, and the digital technologies supporting them has been surveyed by the author in (Atanasiu 2022a, Annex). For technical aspects related to the Document Towers, the rationale of its design paradigm, its technological and cultural background, and the utility of document structure representation in a multifarious range of applications, the reader may refer to the author's dedicated publications (Atanasiu 2022a; Atanasiu and Ingold 2021).…”
Section: Introduction
Citation type: mentioning
confidence: 99%