2021
DOI: 10.3828/mlo.v0i0.364

Creating the European Literary Text Collection (ELTeC): Challenges and Perspectives

Abstract: The aim of this contribution is to reflect on the process of building the multilingual European Literary Text Collection (ELTeC) that is being created in the framework of the networking project Distant Reading for European Literary History funded by COST (European Cooperation in Science and Technology). To provide some background, we briefly introduce the basic idea of ELTeC with a focus on the overall goals and intended usage scenarios. We then describe the collection composition principles that we have deriv…

Cited by 17 publications (7 citation statements)
References 6 publications
“…This study uses 10 ELTeC corpora: English, French, German, Hungarian, Norwegian, Portuguese, Romanian, Serbian, Slovenian, and Spanish. ELTeC was created to reflect a sample of novels in various languages (the core collection consisting of 10, the expanded collection of 20 languages) from 1840 to 1920 based on criteria that ensured “rough comparability” (Schöch et al, 2021: 4) across subcollections. Each of the corpora includes 100 public-domain novels, with diverse metadata (author name, gender, publication date, word count, etc.)…”
Section: Data (citation type: mentioning)
confidence: 99%
“…canonicity, reflected in reprint count). Although the editors aimed at balanced subcollections and fair distribution according to variables, not all corpora could comply with the proposed criteria (for detailed discussions of the selection criteria and the process of corpus-building, see Burnard et al, 2021; Herrmann et al, 2020; Schöch et al, 2021).…”
Section: Data (citation type: mentioning)
confidence: 99%
“…Studying this relationship helps not only to expand but also to redefine how concepts of frames build literary contexts. The discovery of how frame structures affect the semantics, structure, and intertextual connections in texts stimulates further enrichment of the analysis of linguistic devices that contribute to the creation of a specific atmosphere and impression on readers (Schöch et al, 2021).…”
Section: Author Contributions (citation type: mentioning)
confidence: 99%
“…In order to create representative sub-collections for the corresponding languages, the novels were selected to evenly represent (1) novels of various sizes: short (10-50,000 words), medium (50-100,000 words), and long (more than 100,000 words); (2) four 20-year time periods T1 [1840-1859], T2 [1860-1879], T3 [1880-1899], T4 ; (3) the number of reprints, as a measure of canonicity (novels known to wider audience and completely forgotten), and (4) female and male authors [32].…”
Section: Dataset (citation type: mentioning)
confidence: 99%
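
The size and period criteria quoted in the last statement can be sketched as a small classifier. This is only an illustration, not ELTeC's own tooling: the function names are hypothetical, "10-50,000 words" is read as 10,000 to 50,000, a novel of exactly 50,000 words is assumed to count as medium, and T4 is omitted because its range is not given in the quoted text.

```python
from typing import Optional

# Period bounds exactly as quoted in the statement above. T4 is left out
# on purpose: its year range is missing from the quoted text.
PERIODS = {
    "T1": (1840, 1859),
    "T2": (1860, 1879),
    "T3": (1880, 1899),
}


def size_category(word_count: int) -> Optional[str]:
    """Map a word count to short / medium / long.

    Assumptions (not stated in the source): the lower bound "10" means
    10,000 words, and exactly 50,000 words counts as medium.
    """
    if word_count < 10_000:
        return None  # below the smallest quoted size class
    if word_count < 50_000:
        return "short"
    if word_count <= 100_000:
        return "medium"
    return "long"  # "more than 100,000 words"


def period(year: int) -> Optional[str]:
    """Return the period label for a publication year, or None when the
    year falls outside the ranges quoted above."""
    for label, (start, end) in PERIODS.items():
        if start <= year <= end:
            return label
    return None
```

Under these assumptions, a 72,000-word novel first published in 1865 would be labelled medium and T2.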