Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2002
DOI: 10.1145/564376.564401
Unsupervised document classification using sequential information maximization

Abstract: We present a novel sequential clustering algorithm which is motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential (sIB) approach is guaranteed to converge to a local maximum of the information, as required by the original IB principle. Moreover, the time and space complexity are significantly improved. We apply this algorithm to unsupervised document classification. In our evaluation, on small and medium size corpora, the sIB is found to be consi…
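The abstract describes a sequential procedure: each document is drawn out of its current cluster and re-inserted where doing so costs the least information. A minimal sketch of that loop is below; it is an illustrative reconstruction, not the authors' implementation — the cluster representation, the weighted Jensen-Shannon merge cost, and the sweep schedule are standard choices assumed here.

```python
import numpy as np

def weighted_js(p, q, wp, wq):
    """Weighted Jensen-Shannon divergence between two pmfs p and q."""
    pi_p, pi_q = wp / (wp + wq), wq / (wp + wq)
    m = pi_p * p + pi_q * q
    def kl(a, b):
        mask = a > 0
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))
    return pi_p * kl(p, m) + pi_q * kl(q, m)

def sib_sketch(counts, k, n_sweeps=10, seed=0):
    """Sequential clustering sketch: draw each document out of its
    cluster and re-insert it where the merge cost (the weighted JS
    divergence between the document's word distribution and the
    cluster's) is smallest."""
    rng = np.random.default_rng(seed)
    n = counts.shape[0]
    p = counts / counts.sum(axis=1, keepdims=True)   # p(y|x) per document
    w = counts.sum(axis=1) / counts.sum()            # document priors p(x)
    labels = rng.integers(0, k, size=n)
    for _ in range(n_sweeps):
        for i in rng.permutation(n):
            labels[i] = -1                           # draw document i out
            costs = np.empty(k)
            for t in range(k):
                members = labels == t
                if not members.any():                # empty cluster: zero cost
                    costs[t] = 0.0
                    continue
                wt = w[members].sum()
                pt = (w[members, None] * p[members]).sum(axis=0) / wt
                costs[t] = (w[i] + wt) * weighted_js(p[i], pt, w[i], wt)
            labels[i] = int(np.argmin(costs))        # re-insert
    return labels
```

Each re-insertion can only lower (or keep) the total merge cost, which is why the sequential scheme converges to a local optimum, unlike a one-shot agglomerative merge.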


Cited by 177 publications (129 citation statements)
References 13 publications
“…The work in [72] uses a partially supervised EM algorithm which iteratively assigns labels to the unlabeled documents and refines them over time as convergence is achieved. A number of similar methods in this spirit are proposed in [4,14,35,47,89] with varying levels of supervision in the clustering process. Partially supervised clustering methods are also used for feature transformation in classification, as discussed in [17,18,88].…”
Section: Semi-supervised Clustering
confidence: 99%
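The iterative label-assign-and-refine loop described above can be sketched as self-training EM over a multinomial naive Bayes model. This is a schematic reading of the cited approach, not the implementation from [72]; the model family, Laplace smoothing, and fixed iteration count are assumptions.

```python
import numpy as np

def em_self_training(X_lab, y_lab, X_unl, n_classes, n_iter=20, alpha=1.0):
    """Partially supervised EM (multinomial naive Bayes): fit on the
    labeled documents, then alternate between soft-labeling the
    unlabeled documents (E-step) and refitting on everything (M-step)."""
    K, V = n_classes, X_lab.shape[1]
    # class-conditional word counts and class mass from the labeled data
    C = np.zeros((K, V))
    m = np.zeros(K)
    for k in range(K):
        C[k] = X_lab[y_lab == k].sum(axis=0)
        m[k] = (y_lab == k).sum()

    def m_step(counts, mass):
        log_prior = np.log((mass + alpha) / (mass.sum() + K * alpha))
        word = counts + alpha                      # Laplace smoothing
        log_word = np.log(word / word.sum(axis=1, keepdims=True))
        return log_prior, log_word

    log_prior, log_word = m_step(C, m)
    for _ in range(n_iter):
        # E-step: class posteriors for the unlabeled documents
        logp = log_prior + X_unl @ log_word.T
        logp -= logp.max(axis=1, keepdims=True)
        R = np.exp(logp)
        R /= R.sum(axis=1, keepdims=True)
        # M-step: labeled counts plus fractional counts from R
        log_prior, log_word = m_step(C + R.T @ X_unl, m + R.sum(axis=0))
    return R.argmax(axis=1)
```

The soft responsibilities R are what "refines them over time" refers to: each EM round sharpens the provisional labels on the unlabeled pool.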
“…As in training, the 3x3, 5x5 and 10x10 maps were all tested. To analyse the results of categorisation between the topographic maps we utilised techniques from conventional text-based categorisation analysis, including: precision [50], the Jaccard or JAC method [51], and the Fowlkes-Mallows or FM method [52]. Since classification is unsupervised, it is not possible to apply these evaluation methods directly as would be the case for supervised learning.…”
Section: Testing
confidence: 99%
“…For this reason, the labels (architects) identified from training are maintained so as to assign categories. The "micro-averaged" precision matrix method [50] was first used to evaluate each network and the well-established JAC and FM methods were then used to evaluate cluster quality; see [40] for further details of these evaluation methods.…”
Section: Testing
confidence: 99%
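The JAC and FM measures quoted above both compare two labelings by counting agreements over document pairs. A minimal pair-counting sketch of these two indices (standard definitions, not taken from [51] or [52] directly):

```python
from itertools import combinations

def pair_counts(labels_true, labels_pred):
    """Over all document pairs: a = same cluster in both labelings,
    b = together in true only, c = together in predicted only."""
    a = b = c = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_t = labels_true[i] == labels_true[j]
        same_p = labels_pred[i] == labels_pred[j]
        if same_t and same_p:
            a += 1
        elif same_t:
            b += 1
        elif same_p:
            c += 1
    return a, b, c

def jaccard(labels_true, labels_pred):
    """Jaccard (JAC) index: agreeing pairs over all co-clustered pairs."""
    a, b, c = pair_counts(labels_true, labels_pred)
    return a / (a + b + c)

def fowlkes_mallows(labels_true, labels_pred):
    """Fowlkes-Mallows (FM) index: geometric mean of pairwise
    precision and recall."""
    a, b, c = pair_counts(labels_true, labels_pred)
    return (a / (a + b)) ** 0.5 * (a / (a + c)) ** 0.5
```

Because both indices only look at whether pairs are co-clustered, they are invariant to permuting cluster labels — exactly what makes them usable when, as the quote notes, unsupervised cluster identities carry no fixed meaning.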
“…Each sequence X^n is generated by some unknown random process P_{X|Y}, uniquely determined by its label Y. Assuming that each element of X^n lies in a discrete and finite set X, its empirical distribution (or type [11]) is defined as the pmf $\hat{P}_{X^n}(x) = n^{-1}\sum_{i=1}^{n} \mathbf{1}(X_i = x)$, i.e., it results from counting the number of occurrences of each symbol x of X in X^n, and is an approximation of the true process. We now have the following Problem Formulation: Given L = |Y|, we want to find a partition A1, .…”
Section: Preliminaries and Problem Formulation
confidence: 99%
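The empirical distribution (type) defined in the quote is just per-symbol frequency counting. A small sketch, with the alphabet passed explicitly so that symbols absent from the sequence still get probability zero:

```python
from collections import Counter

def empirical_type(seq, alphabet):
    """Empirical distribution (the 'type') of a sequence: the fraction
    of positions at which each symbol of the alphabet occurs."""
    counts = Counter(seq)
    n = len(seq)
    return {x: counts[x] / n for x in alphabet}
```

By the law of large numbers, this pmf converges to the true per-symbol distribution as n grows, which is the sense in which it "approximates the true process."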
“…Experimental results from a benchmark task of document categorization from the "20 Newsgroups" corpus [10] show that ISPDTs, combined with Jensen-Rényi divergences and "strapping", are competitive with, and in most cases outperform, the sequential information bottleneck procedure [3], which is considered the state-of-the-art in unsupervised document categorization. The paper is organized as follows.…”
Section: Introduction
confidence: 99%