Two information-theoretic tools to assess the performance of multi-class classifiers

2018

Entropy

Self Cite

Data transformation, e.g., feature transformation and selection, is an integral part of any machine learning procedure. In this paper, we introduce an information-theoretic model and tools to assess the quality of data transformations in machine learning tasks. In an unsupervised fashion, we analyze the transformation of a discrete, multivariate source of information X into a discrete, multivariate sink of information Y related by a distribution P_XY. The first contribution is a decomposition of the maximal potential entropy of (X, Y), which we call a balance equation, into its (a) non-transferable, (b) transferable, but not transferred, and (c) transferred parts. Such balance equations can be represented in (de Finetti) entropy diagrams, our second set of contributions. The most important of these, the aggregate channel multivariate entropy triangle, is a visual exploratory tool to assess the effectiveness of multivariate data transformations in transferring information from input to output variables. We also show how these decompositions and balance equations apply to the entropies of X and Y, respectively, and generate entropy triangles for them. As an example, we present the application of these tools to the assessment of information transfer efficiency for Principal Component Analysis and Independent Component Analysis as unsupervised feature transformation and selection procedures in supervised classification tasks.

Section: (B); citation type: mentioning

confidence: 99%

Assessing Information Transmission in Data Transformations with the Channel Multivariate Entropy Triangle

2018

Entropy

Self Cite

“…The entropy triangle is a contingency matrix visualization tool based on an often overlooked decomposition of the joint entropy of two random variables [4]. Figure 1 shows such a decomposition showing the three crucial regions: -The mutual information,…”

Section: The Entropy Triangle: a Visualization Tool; citation type: mentioning

confidence: 99%
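
The decomposition of the joint entropy mentioned in the snippet above can be illustrated numerically. The snippet is truncated after "the mutual information", so the following is only a sketch under an assumption: it uses the standard identity log|X| + log|Y| = ΔH + 2·MI + VI, where ΔH is the divergence of the marginals from uniformity, MI is the mutual information and VI = H(X|Y) + H(Y|X) is the variation of information. The function names `entropy` and `entropy_balance` and the example matrices are hypothetical and are not taken from the cited papers.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in bits of a probability vector (zero cells are skipped)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))


def entropy_balance(counts):
    """Split log2|X| + log2|Y| into delta_h + 2 * MI + VI for a contingency matrix.

    Rows are taken as the true classes (X) and columns as the predicted
    classes (Y); this orientation is an assumption of the sketch.
    """
    counts = np.asarray(counts, dtype=float)
    p_xy = counts / counts.sum()
    p_x = p_xy.sum(axis=1)  # marginal over rows
    p_y = p_xy.sum(axis=0)  # marginal over columns

    h_x, h_y = entropy(p_x), entropy(p_y)
    h_xy = entropy(p_xy.ravel())
    h_uniform = float(np.log2(counts.shape[0]) + np.log2(counts.shape[1]))

    delta_h = h_uniform - (h_x + h_y)  # entropy lost to non-uniform marginals
    mi = h_x + h_y - h_xy              # mutual information
    vi = 2.0 * h_xy - h_x - h_y        # H(X|Y) + H(Y|X)

    return {
        "h_uniform": h_uniform,
        "delta_h": delta_h,
        "mutual_information": mi,
        "variation_of_information": vi,
        # Normalized parts sum to 1, so they can serve as ternary coordinates.
        "coordinates": (delta_h / h_uniform, 2.0 * mi / h_uniform, vi / h_uniform),
    }


# A perfect classifier on balanced data: all mass goes to the mutual-information part.
perfect = entropy_balance([[50, 0], [0, 50]])

# A majority classifier on unbalanced data: no mutual information at all.
majority = entropy_balance([[90, 0], [10, 0]])

print(perfect["coordinates"])
print(majority["coordinates"])
```

The three normalized parts always sum to one (for matrices with more than one cell), which is what allows them to be plotted as a point in a ternary (de Finetti) diagram.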

“…Those at or close to the right vertex are not doing any job on very easy data for which they claim to have very high accuracy: they are specialized (majority) classifiers and our intuition is that they are the kind of classifiers that generate the accuracy paradox [1]. In just this guise, the ET has already been successfully used in the evaluation of Speech Recognition systems [4,7]. But a simple extension of the ET is to endow it with a graduated axis or colormap that also allows us to visualize the correlation of such information-theoretic measures with other measures like accuracy, greatly enhancing its usefulness.…”

Section: Et); citation type: mentioning

confidence: 99%

“…In [4] an information-theoretic visualization scheme was proposed, FJVA and JCdA are supported by EU FP7 project LiMoSINe (contract 288024). CPM has been partially supported by the Spanish Government-Comisión Interministerial de Ciencia y Tecnología project TEC2011-26807 for this paper.…”

Section: Introduction; citation type: mentioning

confidence: 99%

“…Furthermore, it is actually one aspect of a tripolar manifestation [4], hence not adequate as a binary indicator of goodness. Also, it measures how well has the classifier learnt the input distribution, but not what its expected accuracy is.…”

Section: Introduction; citation type: mentioning

confidence: 99%

A Proposal for New Evaluation Metrics and Result Visualization Technique for Sentiment Analysis Tasks

Carrillo-de-Albornoz

Lecture Notes in Computer Science

2013

Self Cite

Abstract. In this paper we propound the use of a number of entropy-based metrics and a visualization tool for the intrinsic evaluation of Sentiment and Reputation Analysis tasks. We provide a theoretical justification for their use and discuss how they complement other accuracy-based metrics. We apply the proposed techniques to the analysis of TASS-SEPLN and RepLab 2012 results and show how the metric is effective for system comparison purposes, for system development and postmortem evaluation.

The Multivariate Entropy Triangle and Applications

Lecture Notes in Computer Science

2016