Proceedings of ACL-2016 System Demonstrations 2016
DOI: 10.18653/v1/p16-4028

new/s/leak – Information Extraction and Visualization for Investigative Data Journalists

Abstract: We present new/s/leak, a novel tool developed for and with the help of journalists, which enables the automatic analysis and discovery of newsworthy stories from large textual datasets. We rely on different NLP preprocessing steps such as named entity tagging, extraction of time expressions, entity networks, relations and metadata. The system features an intuitive web-based user interface based on network visualization combined with data exploring methods and various search and faceting mechanisms. We report the …

Cited by 10 publications (6 citation statements)
References 8 publications
“…To support this role of investigative journalism, we introduce the second, substantially re-engineered and improved version of our software tool new/s/leak ("network of searchable leaks") [16]. It is developed by experts from natural language processing and visualization in computer science in cooperation with journalists from "Der Spiegel", a large German news organization.…”
Section: Investigative Journalism In the Digital Age (mentioning)
confidence: 99%
“…Then, faceted search [16] is used instead of a simple keyword search, as it is more effective for professionals. Even though these frameworks offer impressive visualizations [22,3], they cannot be used for document classification, as it would require training for particular domains. With a set law domain and expert annotators, we are able to perform polarity classification as well.…”
Section: Related Work (mentioning)
confidence: 99%
“…Several past works in text visualization sought to bridge this gap and provide overviews of large text collections to support journalistic practices. However, many focus on high-level summaries and on showcasing structured data extracted from text (e.g., entities, relationships, and trends), dedicating little space (or providing only indirect access) to original text [20,26,38]; other more text-centered tools, on the other hand, were designed for specific types of text collections, e.g., social media posts [13] and PDF documents [8], which limits their generalizability. Text-centered analysis tools in sensemaking [29,33] and digital humanities [23,24] could support journalists in this aspect; however, these tools tend to be fairly complex and not always amenable to the stringent timeframes and dynamic requirements of the newsroom [5].…”
Section: Introduction (mentioning)
confidence: 99%
“…Providing top-down awareness on a corpus level (i.e., distant reading techniques) enables not only to inform corpus trends and high-level patterns but also to provide entry points for further exploration [3,22]. Common approaches include topics [3,12,14,22,26], extracted entities [14,22,33,38], relevant keywords [13,15,27], and aggregate statistics over terms and metadata [15], sometimes organized over time [13,14,27]. These elements are often interactive and serve as content filters, linking back to particular text mentions.…”
Section: Introduction (mentioning)
confidence: 99%