“…The analysis of communication data (or discourse analysis, as it is often called in the computer-supported collaborative learning (CSCL) community) usually starts with coding, or labeling, each turn of communication (or several turns that together constitute larger speech units) according to a framework (rubric) developed to address specific research questions. For example, a number of coding frameworks have been developed to analyze different aspects of the communication among team members, such as frameworks for collaborative problem solving (CPS) skills (Liu et al., 2015), for interactive patterns in collaboration (Andrews et al., 2017), for cohesion and language (Graesser et al., 2004; Dowell et al., 2016), and for dialog acts (Allen and Core, 1997). Once discourse has been coded by humans, natural language processing (NLP) techniques can be used to automate the annotation at an accuracy close to that of human coding (Rosé et al., 2008; Rus et al., 2015; Flor et al., 2016; Hao et al., 2017a).…”
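To make "an accuracy level that is close to human coding" concrete, the sketch below compares automated codes against human codes for the same turns using Cohen's kappa, a common chance-corrected agreement statistic. It is an illustration only: the excerpt does not say which agreement measure the cited studies report, and the category labels (`share`, `negotiate`, `regulate`), the example codes, and the function name `cohen_kappa` are hypothetical.

```python
from collections import Counter


def cohen_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two codings of the same turns."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(codes_a)

    # Observed agreement: fraction of turns given the same code by both coders.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n

    # Expected agreement if both coders assigned codes independently,
    # each following their own marginal label frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical code throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical data: one code per turn, assigned by a human coder and by an
# automated annotator. Labels are made up for illustration.
human_codes = ["share", "negotiate", "share", "regulate", "share", "negotiate"]
auto_codes = ["share", "negotiate", "share", "share", "share", "negotiate"]

kappa = cohen_kappa(human_codes, auto_codes)
print(f"kappa = {kappa:.2f}")
```

In this toy case the two codings agree on five of six turns. Kappa is lower than that raw agreement rate because part of the agreement would be expected by chance given how often each code is used.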