The paper attempts a quantitative, corpus-based approach to multilingualism in contemporary Czech poetry (published both in books and on literary servers). The authors examine the frequency and distribution of foreign (i.e., non-Czech) lexical units, raising questions about the forms and functions of individual lexemes. Three selected poets (T. Kafka, M. Šanda, M. Torčík) are then analyzed in greater depth. The paper also reports on a database currently under development, The Corpus of Contemporary Czech Poetry, and on the possibilities for using it. It suggests how beneficial quantitative data analysis can be in the first phase of linguistically oriented literary research, pointing to the necessity of interconnecting the quantitative and qualitative approaches. Only the researcher's interpretative competence can define the boundaries of the research field and the significance of its elements. When conducting text-centered analyses, language corpora should begin to play a role similar to that of other scientific infrastructure tools, such as bibliographic databases.
Our article reports on the emerging Corpus of Contemporary Czech Poetry and the possibilities for its use. We describe the genesis of the idea of creating a specific corpus that combines the principles of synchronicity and genre instead of relying on the presence of poetry in the general corpus of contemporary Czech. We also characterize the structure of our corpus, which is designed to cover both of the basic media areas in which contemporary poetry is published and distributed: books and open publishing platforms on the Internet (literary forums). We additionally describe the functionalities of the tools for mining the corpus data, which are designed to make comparative analyses across media (print/web) straightforward. We suggest how useful quantitative data analysis can be in the first phase of language-oriented literary research, or rather, we point out the necessity of combining quantitative and qualitative approaches. Only the researcher’s interpretative proficiency can decide on the boundaries of the field under study and the meaning of the elements present in it. In text-centred analyses, language corpora should start to play a role similar to that of other tools of scientific infrastructure, such as bibliographic databases.