2006
DOI: 10.1109/tpami.2006.186

Feature extraction using information-theoretic learning

Abstract: A classification system typically consists of both a feature extractor (preprocessor) and a classifier. These two components can be trained either independently or simultaneously. The former option has an implementation advantage since the extractor need only be trained once for use with any classifier, whereas the latter has an advantage since it can be used to minimize classification error directly. Certain criteria, such as Minimum Classification Error, are better suited for simultaneous training, whereas o…
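The abstract's distinction between independent and simultaneous training can be made concrete with a small sketch. The snippet below is purely illustrative and is not the paper's method: PCA stands in for a generic label-agnostic extractor trained once, while a linear extractor and a logistic classifier are updated jointly by gradient descent so the features adapt to the classification task directly. All names and hyperparameters (learning rate, dimensions, iteration count) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two classes in 5-D, separable in the first two dimensions.
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Independent training: fit the extractor once (PCA, label-agnostic),
# then any classifier can later be trained on the fixed features Z.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T                       # 5-D -> 2-D projection

# Simultaneous training: extractor W and logistic weights w are
# updated jointly to reduce classification error directly.
W = rng.normal(scale=0.1, size=(5, 2))  # feature extractor
w = rng.normal(scale=0.1, size=2)       # logistic classifier
lr = 0.5
for _ in range(1000):
    z = X @ W                           # extracted features
    p = 1.0 / (1.0 + np.exp(-(z @ w)))  # P(class 1 | x)
    g = (p - y) / len(y)                # d(cross-entropy)/d(logits)
    w -= lr * (z.T @ g)                 # classifier update
    W -= lr * np.outer(X.T @ g, w)      # chain rule through extractor

print("joint-training error:", np.mean((p > 0.5) != y))
```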

Year Published: 2009–2023


Cited by 135 publications (80 citation statements)
References 27 publications
“…Information theoretic quantities have been widely used for feature extraction and selection, e.g. Fisher et al [26], Hild et al [27] or Sindhwani et al [28], who proposed a feature selection technique for support vector machines and neural networks. Recently, Butz et al [29] proposed to apply this framework to multi-modal signal processing.…”
Section: Motivations (mentioning)
confidence: 99%
“…In contrast, our approach encodes maximization of MI as variational graph embedding, a much easier problem to solve. In addition, our approach is significantly different from the existing MI-based feature extraction algorithms such as [12,22,10], all of which are formalized as nonlinear non-convex optimization problems and do not account for the local properties of the data, as a result, cannot capture the geometric structure of the underlying manifold, which is, however, fundamental to feature learning as being demonstrated by recent researches [21,1].…”
Section: Related Work (mentioning)
confidence: 95%
“…which has been justified both theoretically and experimentally by many previous works, e.g., [16,22,10]. A kernel density estimator is then employed to estimate the density function involved in Eq.(3).…”
Section: Nonparametric Quadratic MI (mentioning)
confidence: 97%
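The quadratic MI referenced in this statement has a closed-form Parzen-window estimator in the information-theoretic-learning framework: integrals of products of Gaussians reduce to pairwise Gaussian evaluations. The sketch below is an illustrative implementation of the Euclidean-distance quadratic MI between extracted features and discrete class labels, not code from the cited papers; the kernel bandwidth sigma is an assumed free parameter.

```python
import numpy as np

def pairwise_gaussian(Y, sigma):
    """G[i, j] = Gaussian(y_i - y_j; 2*sigma^2*I), using the identity
    that the integral of G(y-yi, s^2)*G(y-yj, s^2) dy = G(yi-yj, 2*s^2)."""
    d = Y.shape[1]
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    norm = (4.0 * np.pi * sigma**2) ** (-d / 2.0)
    return norm * np.exp(-sq / (4.0 * sigma**2))

def quadratic_mi(Y, labels, sigma=1.0):
    """Euclidean-distance quadratic MI between features Y (N x d) and
    discrete labels, with all densities estimated by Gaussian KDE."""
    N = len(Y)
    G = pairwise_gaussian(Y, sigma)
    classes, counts = np.unique(labels, return_counts=True)
    p_c = counts / N  # class priors

    # sum_c  integral p(y, c)^2 dy
    v_in = sum(G[np.ix_(labels == c, labels == c)].sum()
               for c in classes) / N**2
    # sum_c  p(c) * integral p(y, c) p(y) dy
    v_cross = sum(pc * G[labels == c, :].sum()
                  for c, pc in zip(classes, p_c)) / N**2
    # (sum_c p(c)^2) * integral p(y)^2 dy
    v_all = (p_c**2).sum() * G.sum() / N**2
    return v_in - 2.0 * v_cross + v_all

# Informative features should score higher than pure noise:
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=300)
Y_good = labels[:, None] + 0.5 * rng.normal(size=(300, 1))
Y_noise = rng.normal(size=(300, 1))
print(quadratic_mi(Y_good, labels), quadratic_mi(Y_noise, labels))
```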
“…Due to the similarities of the underlying criterion, the proposed method retains the same designation, MRMI-SIG, used previously for feature extraction of static data [23] and instantaneous BSS [14]. There are four variables that must be chosen for this method.…”
Section: Proposed Criterion (mentioning)
confidence: 99%