Proceedings of ACL 2017, System Demonstrations 2017
DOI: 10.18653/v1/p17-4015

Scattertext: a Browser-Based Tool for Visualizing how Corpora Differ

Abstract: Scattertext is an open source tool for visualizing linguistic variation between document categories in a language-independent way. The tool presents a scatterplot, where each axis corresponds to the rank-frequency with which a term occurs in a category of documents. Through a tie-breaking strategy, the tool is able to display thousands of visible term-representing points and find space to legibly label hundreds of them. Scattertext also lends itself to a query-based visualization of how the use of terms with similar embed…
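
The rank-frequency axes described in the abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization and a dense ranking of raw term counts scaled to [0, 1]. The function names `dense_rank_scores` and `term_coordinates` are hypothetical and are not part of the Scattertext API, and the abstract does not specify the exact ranking or tie-breaking scheme the tool uses.

```python
from collections import Counter


def dense_rank_scores(counts):
    """Map each term to the dense rank of its count, scaled to [0, 1].

    Terms with equal counts share a rank (assumption: dense ranking).
    """
    distinct = sorted(set(counts.values()))
    rank_of = {freq: i for i, freq in enumerate(distinct)}
    top = max(len(distinct) - 1, 1)  # avoid division by zero
    return {term: rank_of[freq] / top for term, freq in counts.items()}


def term_coordinates(docs_a, docs_b):
    """Return {term: (x, y)}: x is the rank in category A, y in category B."""
    # Assumption: lowercase whitespace tokenization, for illustration only.
    counts_a = Counter(tok for doc in docs_a for tok in doc.lower().split())
    counts_b = Counter(tok for doc in docs_b for tok in doc.lower().split())
    vocab = set(counts_a) | set(counts_b)
    # Terms absent from a category get a count of zero there.
    full_a = {t: counts_a.get(t, 0) for t in vocab}
    full_b = {t: counts_b.get(t, 0) for t in vocab}
    ranks_a = dense_rank_scores(full_a)
    ranks_b = dense_rank_scores(full_b)
    return {t: (ranks_a[t], ranks_b[t]) for t in vocab}


coords = term_coordinates(
    ["tax cut tax", "tax jobs"],
    ["health care jobs", "care care"],
)
```

In this toy input, terms frequent in only one category land near a corner of the unit square, while terms used about equally in both fall near the diagonal. Many terms share a count and therefore a coordinate, which is the overlap that a tie-breaking strategy such as the one the abstract mentions would need to resolve.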

Citations: cited by 103 publications (71 citation statements)
References: 14 publications

“…As shown in experiments, a language model coupled with goal embedding suffers from role-switching or confusion. It's also interesting to further dive deep with visualizations (Kessler, 2017) and quantify the impact on quality, diversity, and goal focus metrics.…”
Section: Results (mentioning)
confidence: 99%
“…In order to answer this, we extracted the text from these pages and trained separate "word2vec" word embedding models that locate words in a vector space, thereby representing these words' local contexts of use, such that proximate words can be understood as close in meaning (see Mikolov et al. 2015). We then plotted the frequency count of words extracted from the content guidelines of each wiki using the Scattertext term frequency algorithm (Kessler 2017; see figure 5). Finally, to capture the essence of contention between and within each Altpedia, we manually identified and compared terms each wiki uses to refer to one another (see figure 8).…”
Section: Method: Mapping the Partisan Epistemics of Altpedias (mentioning)
confidence: 99%
“…[FP99,BRL]. Scattertext [Kes17] tackles the problem of visually comparing two corpora based on word frequencies.…”
Section: Model Visualization (mentioning)
confidence: 99%