2015
DOI: 10.48550/arxiv.1511.03339
Preprint

Attention to Scale: Scale-aware Semantic Image Segmentation

Cited by 40 publications (40 citation statements)
References 3 publications
“…Evaluation metric: The standard intersection over union (IoU) criterion and pixel-wise accuracy are adopted for evaluation on the PASCAL-Person-Part dataset and the Horse-Cow parsing dataset, following [21], [31], [9]. We use the same evaluation metrics as in [4], [22], [36] for evaluation on the two human parsing datasets, including accuracy, average precision, average recall, and average F-1 score.…”
Section: Methods (mentioning)
confidence: 99%
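The two metrics quoted above can be sketched as follows; the function names and the dense integer-label array layout are illustrative assumptions, not code from the cited papers.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Per-class intersection-over-union, averaged over classes present
    in either map. pred and gt are integer label arrays of equal shape."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

def pixel_accuracy(pred, gt):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return float((pred == gt).mean())
```

Benchmark implementations typically accumulate intersection and union counts over the whole test set before dividing, rather than averaging per-image IoU; the per-call version here is only the core computation.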
“…Compared to Tree-LSTM, Graph LSTM is more natural and general for 2D image processing with arbitrary graph topologies and adaptive updating schemes. Semantic Object Parsing: There has been increasing research interest in the semantic object parsing problem, including general object parsing [21], [12], [30], [20], [31], person part segmentation [9], [8], and human parsing [23], [4], [32], [33], [34], [35], [36]. To capture rich structure information on top of advanced CNN architectures, one common approach is the combination of CNNs and CRFs [11], [6], [37], [12], where the CNN outputs are treated as unary potentials while the CRF further incorporates pairwise or higher-order factors.…”
Section: Related Work (mentioning)
confidence: 99%
“…Incorporating multi-scale inputs: State-of-the-art models on the PASCAL VOC 2012 leaderboard usually employ multi-scale features (either multi-scale inputs [10,28,7] or features from intermediate layers of the DCNN [31,19,5]). Motivated by this, we further combine our proposed discriminatively trained domain transform with the model of [7], yielding 76.3% performance on the test set, 1.5% behind the current best models [28], which jointly train the CRF and DCNN [6]. EdgeNet on BSDS500: We further evaluate the edge detection performance of our learned EdgeNet on the test set of BSDS500 [1]. We employ the standard metrics to evaluate edge detection accuracy: fixed contour threshold (ODS F-score), per-image best threshold (OIS F-score), and average precision (AP).…”
Section: Models Pretrained With MS-COCO (mentioning)
confidence: 99%
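The distinction between ODS and OIS in the quote above is only in where the threshold is chosen. A minimal sketch, assuming per-image true-positive/false-positive/false-negative match counts have already been computed at each candidate threshold (the array shapes and function name are assumptions, not the BSDS500 benchmark code):

```python
import numpy as np

def f_score(p, r):
    """Harmonic mean of precision and recall, safe at p + r == 0."""
    return np.where(p + r > 0, 2 * p * r / np.maximum(p + r, 1e-12), 0.0)

def ods_ois(tp, fp, fn):
    """ODS and OIS F-scores from match counts.

    tp, fp, fn: (n_images, n_thresholds) arrays of edge-match counts.
    ODS pools counts over the dataset and picks one shared threshold;
    OIS picks the best threshold per image, then averages the F-scores.
    """
    # ODS: aggregate counts across all images at each threshold.
    P = tp.sum(0) / np.maximum(tp.sum(0) + fp.sum(0), 1)
    R = tp.sum(0) / np.maximum(tp.sum(0) + fn.sum(0), 1)
    ods = f_score(P, R).max()
    # OIS: per-image precision/recall, best threshold chosen per image.
    p = tp / np.maximum(tp + fp, 1)
    r = tp / np.maximum(tp + fn, 1)
    ois = f_score(p, r).max(axis=1).mean()
    return float(ods), float(ois)
```

The real benchmark additionally performs bipartite matching between predicted and ground-truth edge pixels to obtain the counts; that step is outside this sketch.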
“…As it has been widely confirmed that feeding multiple scales of an input image to networks with shared parameters is beneficial for accurately localizing objects of different scales in pixel labeling problems [15,8,14,32], we replicate the refined VGG network from the previous section three times, each replica responsible for one of the scales. An input image is resized to three different scales (s ∈ {1, 0.75, 0.5}).…”
Section: Multi-scale Fusion With Attentional Weights (mentioning)
confidence: 99%
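The three-scale shared-weight scheme described above is typically fused with pixel-wise soft attention, as in the Attention-to-Scale paper this report covers: a softmax over the scale axis weights each scale's score map at every pixel. A minimal sketch of that fusion step (the array shapes and function names are assumptions; in the actual model the attention logits are themselves predicted by a small network from shared features):

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fusion(score_maps, attention_logits):
    """Fuse per-scale score maps with pixel-wise soft attention weights.

    score_maps:       (S, C, H, W) class scores from the shared network,
                      one entry per scale, resized to a common resolution.
    attention_logits: (S, H, W) unnormalized attention for each scale.
    Returns fused scores of shape (C, H, W).
    """
    weights = softmax(attention_logits, axis=0)      # sums to 1 over scales
    return (weights[:, None, :, :] * score_maps).sum(axis=0)
```

Because the weights sum to one at every pixel, the fusion reduces to plain averaging when the logits are equal and to hard scale selection as one logit dominates.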