2009
DOI: 10.1145/1497577.1497579

Semi-analytical method for analyzing models and model selection measures based on moment analysis

Abstract: In this article we propose a moment-based method for studying models and model selection measures. By focusing on the probabilistic space of classifiers induced by the classification algorithm rather than on that of datasets, we obtain efficient characterizations for computing the moments, which is followed by visualization of the resulting formulae that are too complicated for direct interpretation. By assuming the data to be drawn independently and identically distributed from the underlying probability dist…

Cited by 7 publications (6 citation statements)
References 21 publications (18 reference statements)
“…observations (e.g. [3,4,6,14,15,22,23,30,32,33,37,39]). However, none of them address the problems of (1) dependence between related instances and (2) dependence between training and test set sizes.…”
Section: Both Of the Above Methodologies Generally Address The Statis (mentioning)
confidence: 99%
“…We then revisit the generic expressions provided in [Dhurandhar and Dobra 2009] that need to be customized to specific classification algorithms; in this case an ensemble of random decision trees. Finally, we provide the customized expressions for random decision trees (not ensemble) of prespecified height and discuss the intuitions in its derivation.…”
Section: Preliminaries (mentioning)
confidence: 99%
“…Hence, having fewer variables will invariably lead to tighter bounds. If we were to derive bounds directly, we would have to consider the entire dataset with all the cells without collapsing or aggregation of cells into fewer cells and hence fewer variables [Dhurandhar and Dobra 2009], as is the case using the generic expressions. The reason for this is that the generic expressions consider cells corresponding to an individual input or pairs of inputs rather than the entire input space.…”
Section: Practical Considerations (mentioning)
confidence: 99%