2020
DOI: 10.1080/01615440.2020.1823289

Exploring the dynamic changes of key concepts of the Hungarian socialist era with natural language processing methods

Abstract: The analysis of social discourses from the perspective of historical changes deserves special attention. Such a study could play a key role in revealing social changes and the latent narratives of those in power, and in understanding the underlying social dynamics of a given period. Until recent years, such issues were analyzed mainly with a qualitative approach. In our paper we present a new way of discovering and interpreting social discourses using an advanced NLP method called word embedding. Based on w…

Cited by 5 publications (4 citation statements)
References 22 publications
“…While the application of these distributional semantic models has achieved important practical advances, the exploration of the informational content present in them remains an open and promising topic (Utsumi, 2020). Knowledge about the nature of the information contained there has already proven useful for disciplines such as psychology (McGregor, et al, 2019), neuropsychology (Dematties et al, 2020) and history (Szabó et al, 2020).…”
Section: Discussion and Conclusions (unclassified)
“…With this tool, the text was first split into sentences, then tokenized, and finally the tokens were lemmatized. A token is a semantic unit, usually separated by spaces from other character sequences in the text (Szabó et al 2020). A token can be a word, a number, or punctuation as well.…”
Section: Feature Set (mentioning)
confidence: 99%
“…Just as during the project "FinUgReVita", the documents are converted into text files with the help of Optical Character Recognition, segmenting long vowels into two characters (as described in Horváth et al, 2017: 63). For the OCR analysis the open source OCR engine "tesseract" (https://github.com/tesseractocr) is going to be used, as it is freely available, and has been used for processing resources of a similar period, cultural background and amount (Szabó et al, 2020, Kmetty et al, 2020). The text files are being proofread and normalised according to the transcription used in the Mansi press and contemporary educational publications.…”
Section: Processing the Sources (mentioning)
confidence: 99%