2020
DOI: 10.48550/arxiv.2006.15864
Preprint

Deep Ordinal Regression with Label Diversity

Axel Berg,
Magnus Oskarsson,
Mark O'Connor

Abstract: Regression via classification (RvC) is a common method used for regression problems in deep learning, where the target variable belongs to a set of continuous values. By discretizing the target into a set of non-overlapping classes, it has been shown that training a classifier can improve neural network accuracy compared to using a standard regression approach. However, it is not clear how the set of discrete classes should be chosen and how it affects the overall solution. In this work, we propose that using …
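
The discretization step that the abstract describes can be sketched as follows. This is a minimal illustration of generic regression via classification, not the method proposed in the paper (the abstract is truncated before it states the proposal). It assumes equal-width bins and decoding by bin midpoint, and the names `make_bins`, `to_class`, and `to_value` are hypothetical.

```python
def make_bins(low, high, num_classes):
    """Return num_classes + 1 equally spaced bin edges covering [low, high]."""
    width = (high - low) / num_classes
    return [low + i * width for i in range(num_classes + 1)]


def to_class(y, edges):
    """Map a continuous target y to the index of the bin that contains it."""
    num_classes = len(edges) - 1
    for k in range(num_classes):
        if y < edges[k + 1]:
            return k
    # Values at or above the top edge fall into the last bin.
    return num_classes - 1


def to_value(k, edges):
    """Map a class index back to a continuous value (assumed: the bin midpoint)."""
    return 0.5 * (edges[k] + edges[k + 1])


# Assumed example: targets in [0, 100] split into 10 non-overlapping classes.
edges = make_bins(0.0, 100.0, 10)
label = to_class(37.0, edges)      # class index used to train a classifier
estimate = to_value(label, edges)  # continuous value recovered from the class
```

A classifier trained on `label` can only recover the target up to the bin width, which is one reason the choice of classes affects the solution.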

Cited by 2 publications (4 citation statements)
References 28 publications
“…Several deep learning approaches for ordinal regression were proposed in the recent years. A common approach seems to be to turn the ordinal regression problem into a multi-label classification problem, for example [Fu et al, 2018; Liu et al, 2017, 2018b; TV et al, 2019; Berg et al, 2020; Cheng et al, 2008]. We argue that the multi-label approach has two major problematic aspects: first, the output probabilities are not always guaranteed to be consistent, in the sense of decreasing cumulative distribution (that is we would like to predict Pr(y ≥ 1) ≥ Pr(y ≥ 2) ≥ .…”
Section: Related Work (mentioning)
confidence: 99%
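
The consistency condition in the statement above, Pr(y ≥ 1) ≥ Pr(y ≥ 2) ≥ ..., can be checked with a few lines of code. The sketch below is only an illustration of that condition. It assumes the model emits one independent probability per threshold, and the function name `is_rank_consistent` and the example values are hypothetical.

```python
def is_rank_consistent(cum_probs):
    """Return True if cum_probs[k], read as Pr(y >= k + 1), never increases with k."""
    return all(a >= b for a, b in zip(cum_probs, cum_probs[1:]))


# Made-up outputs of a multi-label head with four thresholds.
consistent = [0.95, 0.80, 0.40, 0.10]
inconsistent = [0.95, 0.40, 0.80, 0.10]  # here Pr(y >= 3) > Pr(y >= 2)

print(is_rank_consistent(consistent))
print(is_rank_consistent(inconsistent))
```

Because each threshold is predicted separately in this setup, nothing forces the second list to be ruled out, which is the concern the citing authors raise.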
“…Several works propose the training objective to be cross entropy, while using one-hot (or binary) targets, see, for example [Belharbi et al, 2019], [Vargas et al, 2020], Fu et al [2018]; Beckham and Pal [2017]; Berg et al [2020]; TV et al [2019]. As pointed out in several papers, and will also be demonstrated in section 3, in the case of one-hot targets, the cross entropy term equals the negative log of the probability assigned by the model to the true class, making it invariant to the distribution of the remaining probability mass.…”
Section: Related Work (mentioning)
confidence: 99%
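
The invariance described in the statement above can be verified numerically. The sketch below uses made-up probabilities and a hypothetical helper `cross_entropy`; it is a worked illustration of the claim, not code from any of the cited papers.

```python
import math


def cross_entropy(target, predicted):
    """Cross entropy -sum_k t_k * log(p_k), skipping classes with zero target mass."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


one_hot = [0.0, 1.0, 0.0, 0.0]  # true class is index 1

# Both predictions give 0.6 to the true class but place the rest differently.
near_miss = [0.30, 0.60, 0.05, 0.05]  # leftover mass next to the true class
far_miss = [0.05, 0.60, 0.05, 0.30]   # leftover mass far from the true class

loss_near = cross_entropy(one_hot, near_miss)
loss_far = cross_entropy(one_hot, far_miss)
```

Both losses reduce to -log(0.6), so with one-hot targets the objective does not distinguish a prediction that errs toward a neighbouring class from one that errs toward a distant class.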
“…It contains over 20,000 images and the age labels are per year level from 0 to 116. We trained and evaluated with this dataset. We split 20% of the data as test data similar to previous methods [2,4,17], but we used from people of 0 to 100 years old as training data nonetheless they trained and evaluated with the people of 21 to 60 years old. However, we tested 21 to 60 years old people to compare with these methods.…”
Section: Datasets (mentioning)
confidence: 99%
“…In recent years, convolutional neural networks (CNN) based methods achieved great success in age estimation [42,24,4,31,17,2]. Although their methods are varied from each other, most of their research target is to estimate from images like pictures for identification certificates in which only one face exists.…”
Section: Introduction (mentioning)
confidence: 99%