There is a wide set of evaluation metrics available to compare the quality of text clustering algorithms. In this article, we define a few intuitive formal constraints on such metrics which shed light on which aspects of the quality of a clustering are captured by different metric families. These formal constraints are validated in an experiment involving human assessments, and compared with other constraints proposed in the literature. Our analysis of a wide range of metrics shows that only BCubed satisfies all formal constraints. We also extend the analysis to the problem of overlapping clustering, where items can simultaneously belong to more than one cluster. As BCubed cannot be directly applied to this task, we propose a modified version of BCubed that avoids the problems found with other metrics.
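For reference, standard (non-overlapping) BCubed precision and recall average, over all items, the proportion of items sharing an item's cluster that also share its gold category (precision) and the proportion of items sharing its gold category that also share its cluster (recall). The Python sketch below illustrates only this standard formulation; the function name, toy labels, and data layout are illustrative assumptions, and the extended BCubed for overlapping clustering proposed in the article is not reproduced here.

```python
def bcubed(system, gold):
    """Compute BCubed precision, recall, and F1 for a hard (non-overlapping) clustering.

    system, gold: dicts mapping each item to a single cluster label / gold category label.
    """
    items = list(system)
    precision = recall = 0.0
    for e in items:
        # Items sharing e's system cluster and e's gold category (both sets include e).
        same_cluster = [o for o in items if system[o] == system[e]]
        same_category = [o for o in items if gold[o] == gold[e]]
        # Item-level precision: fraction of e's cluster that shares e's category.
        precision += sum(1 for o in same_cluster if gold[o] == gold[e]) / len(same_cluster)
        # Item-level recall: fraction of e's category that shares e's cluster.
        recall += sum(1 for o in same_category if system[o] == system[e]) / len(same_category)
    precision /= len(items)
    recall /= len(items)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy example: one item (d3) grouped with the wrong category.
    system = {"d1": "c1", "d2": "c1", "d3": "c1", "d4": "c2"}
    gold = {"d1": "A", "d2": "A", "d3": "B", "d4": "B"}
    print(bcubed(system, gold))  # -> (0.666..., 0.75, 0.705...)
```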
One of the most challenging aspects of Computer-Supported Collaborative Learning (CSCL) research is automation of collaboration and interaction analysis in order to understand and improve the learning processes. It is particularly necessary to look in more depth at the joint analysis of the collaborative process and its resulting product. In this article we present a framework for comprehensive analysis in CSCL synchronous environments supporting a problem solving approach to learning. This framework is based on an observation-abstraction-intervention analysis life-cycle and consists of a suite of analysis indicators, procedures for calculating indicators and a model of intervention based on indicators. Analysis indicators are used to represent the collaboration and knowledge building process at different levels of abstraction, and to characterize the solution built using models of the application domain, the problems to solve and their solutions. The analysis procedures combine analysis of actions and dialogue with analysis of the solution. In this way, the process and the solution are studied independently as well as together, enabling the detection of correlations between them. In order to exemplify and test the framework, the methodological process underlying the framework was followed to guide the implementation of the analysis subsystems of two existing CSCL environments. In addition, a number of studies have been conducted to evaluate the framework's approach, demonstrating that certain modes of collaborating and working imply particular types of solutions and vice versa.
This paper describes the creation of a testbed to evaluate strategies for searching for people on the World Wide Web. The task involves resolving the ambiguity of person names and locating relevant information that characterises each individual sharing the same name.