Collaborative problem solving (CPS) is inherently an interactive, conjoint, dual-strand process that encompasses both how a student reasons about a problem and how he or she interacts with others to regulate social processes and exchange information (OECD, 2013). Measuring CPS skills poses the challenge of obtaining consistent, accurate, and reliable scores across individuals and user populations. The 2015 cycle of the Programme for International Student Assessment (PISA) was the first international large-scale assessment to include a measure of CPS, in which computer-based conversational agents were designed to represent team members with a range of skills and abilities. Drawing on the CPS domain in PISA 2015, this study addresses the challenges and solutions related to CPS item design and sheds light on sequential conversation-based measurement. Specifically, we present the process of CPS item design and the development of scoring rules based on CPS conversation paths, and we discuss possible approaches to estimating CPS beyond item response models.