This paper introduces the Spoken British National Corpus 2014, an 11.5-million-word corpus of orthographically transcribed conversations among L1 speakers of British English from across the UK, recorded in the years 2012–2016. After showing that a survey of the recent history of corpora of spoken British English justifies the compilation of this new corpus, we describe the main stages of the Spoken BNC2014’s creation: design, data and metadata collection, transcription, XML encoding, and annotation. In doing so we aim to (i) encourage users of the corpus to approach the data with sensitivity to the many methodological issues we identified and attempted to overcome while compiling the Spoken BNC2014, and (ii) inform (future) compilers of spoken corpora of the innovations we implemented to attempt to make the construction of corpora representing spontaneous speech in informal contexts more tractable, both logistically and practically, than in the past.
This article focuses on how register considerations informed and guided the design of the spoken component of the British National Corpus 2014 (Spoken BNC2014). It discusses why the compilers of the corpus sought to gather recordings from just one broad spoken register – ‘informal conversation’ – and how this and other design decisions afforded contributors to the corpus considerable freedom with regard to the selection of situational contexts for the recordings. This freedom resulted in a high level of diversity in the corpus for situational parameters such as recording location and activity type, each of which was captured in the corpus metadata. Focussing on these parameters, this article provides evidence for functional variation among the texts in the corpus and suggests that differences such as those observed here could be analysed within existing frameworks for the analysis of register variation in spoken and written language, such as multidimensional analysis.