This chapter presents a corpus-based text analysis tool along with a research approach to conducting rhetorical analyses of individual texts as well as text collections. The motivation for our computational approach, the system's development and evaluation, and its research and educational applications are discussed. The tool, called DocuScope, supports both quantitative and quantitatively informed qualitative analyses of rhetorical strategies found in a broad range of textual artifacts, using a standard home-grown dictionary consisting of more than 40 million unique patterns of English that are classified into over 100 rhetorical functions. DocuScope also provides an authoring environment that allows investigators to build their own customized dictionaries according to their own language theories. Research published with both the standard and customized dictionaries is discussed, as well as tradeoffs, limitations, and directions for the future.
In 2006, Thomas Orr guest-edited a special issue of IEEE TRANSACTIONS ON PROFESSIONAL COMMUNICATION that provided insight from corpus linguistics for professional communication [1]. Orr described the "quest" to understand and improve how professionals communicate in the workplace, citing computer-aided corpus linguistics as a useful and complementary tool for empirically oriented researchers [p. 213]. The issue showcased rhetorical and linguistic analyses of professional genres and language strategies. It also introduced readers to what is now one of the most widely used text analysis toolkits [2]. The quest to understand the nuances of professional communication using computational tools has continued since, and many researchers in our field have embraced the new interdisciplinary approach now known as data science. Our quick metadata search of the journals and conference proceedings in technical and professional communication (TPC) revealed an increasing number of articles associated with terms commonly used in data science (e.g., big data, content analysis, text mining, sentiment analysis, topic modeling, network analysis) and originating from numerous disciplines (e.g., corpus linguistics, computational linguistics, artificial intelligence, statistics, business analytics). Yet the field of TPC is just beginning to embrace the power of data-driven approaches. This special issue extends Orr's work by taking a snapshot of current work in data-driven approaches to the study of TPC.