Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval 2012
DOI: 10.1145/2348283.2348411

A utility-theoretic ranking method for semi-automated text classification

Abstract: In Semi-Automated Text Classification (SATC) an automatic classifier Φ labels a set of unlabelled documents D, following which a human annotator inspects (and corrects when appropriate) the labels attributed by Φ to a subset D′ of D, with the aim of improving the overall quality of the labelling. An automated system can support this process by ranking the automatically labelled documents in a way that maximizes the expected increase in effectiveness that derives from inspecting D′. An obvious strategy is to rank D…

Cited by 9 publications (25 citation statements)
References 21 publications
“…In that work, Berardi et al deployed a utility-theoretic ranking approach for semi-automatic text classification [8]. Their approach ranks documents by the expected gain in accuracy that a classification system can achieve by having a reviewer correct mis-classified instances, i.e.…”
Section: Related Work (mentioning)
confidence: 99%
“…The U-Theoretic approach described in [2] tackles SATC in a "multi-label multi-class" context, i.e., there is a set of classes C = {c1, . .…”
Section: SATC Sensitivity Identification (mentioning)
confidence: 99%
“…SATC (see [2,5,6]) is defined as the task of ranking a set D of automatically classified textual documents in such a way that, if a human annotator validates the documents in a top-ranked portion of D with the goal of increasing the overall classification accuracy of D, the expected increase in accuracy is maximized. Therefore, we envisage our annotators as validating documents by sensitivity, starting from the top of the ranked list we generate, and working downwards (until they are confident that the dataset has been cleared up, or until the budget for annotation work has been spent).…”
Section: Introduction (mentioning)
confidence: 99%
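
The ranking idea in the statements above (order the automatically labelled documents so that validating the top of the list yields the largest expected increase in accuracy) can be illustrated with a minimal Python sketch. It is not the method of the cited paper. The names `LabelledDoc`, `p_correct`, `gain_fp` and `gain_fn` are hypothetical, and the sketch assumes that an estimate of each label's correctness and a per-error gain are already available.

```python
from dataclasses import dataclass


@dataclass
class LabelledDoc:
    doc_id: str
    predicted_positive: bool  # label assigned by the automatic classifier
    p_correct: float          # assumed estimate that this label is correct


def expected_gain(doc, gain_fp=1.0, gain_fn=1.0):
    """Expected benefit of inspecting one document.

    Probability that the label is wrong, times the (hypothetical) gain
    obtained by correcting that kind of error.
    """
    p_error = 1.0 - doc.p_correct
    gain = gain_fp if doc.predicted_positive else gain_fn
    return p_error * gain


def rank_for_inspection(docs, gain_fp=1.0, gain_fn=1.0):
    """Sort documents so the highest expected gain comes first."""
    return sorted(
        docs,
        key=lambda d: expected_gain(d, gain_fp, gain_fn),
        reverse=True,
    )


def inspect_top(ranked, budget):
    """Documents the annotator validates before the budget runs out."""
    return ranked[:budget]


docs = [
    LabelledDoc("a", predicted_positive=True, p_correct=0.60),
    LabelledDoc("b", predicted_positive=False, p_correct=0.70),
    LabelledDoc("c", predicted_positive=True, p_correct=0.95),
]

# Equal gains: the ranking reduces to "least confident label first".
equal_gain_order = [d.doc_id for d in rank_for_inspection(docs)]

# Correcting a false negative assumed to be worth twice a false positive.
weighted_order = [d.doc_id for d in rank_for_inspection(docs, gain_fn=2.0)]

# With a budget of two documents, only the top two are validated.
validated = [d.doc_id for d in inspect_top(rank_for_inspection(docs), 2)]
```

With equal gains the order depends only on the classifier's confidence. Once the two error types are weighted differently, a document with a more confident label can still move to the top, which is the difference between ranking by expected gain and ranking by confidence alone.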