2022
DOI: 10.1007/978-3-031-20074-8_26
A Dense Material Segmentation Dataset for Indoor and Outdoor Scene Parsing

Cited by 13 publications (10 citation statements); references 41 publications.
“…Through extensive experiments, our small model obtains mIoU scores of 40.2% and 51.1%, surpassing the previous multi-task learning baseline [10] by an absolute +5.7% and +7.0% on COCOStuff-10K and DMS, respectively. For single-task learning on material segmentation, i.e., on DMS, our model reaches state-of-the-art performance, with +8.1% mIoU gains over the previous CNN counterpart [13]. To verify the practicability of our MATERobot for recognizing material categories in real-world scenarios, we conduct a task-oriented user study with six blindfolded participants.…”
Section: Introduction (mentioning)
confidence: 86%
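The statement above reports gains in mIoU (mean Intersection-over-Union), the standard segmentation metric. As a quick reference, here is a minimal NumPy sketch of how mean IoU is typically computed over label maps; the function name and interface are illustrative, not taken from any of the cited works:

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union, averaged over classes that occur
    in either the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:        # class absent from both maps: skip it
            continue
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))
```

For example, with `pred = [[0, 1], [1, 1]]` and `target = [[0, 1], [0, 1]]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.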
“…Recently, the Vision Transformer [11], [20] was proposed to use the self-attention operation in transformer layers to extract non-local features from a sequence of image patches, yielding an alternative backbone to convolutional counterparts [19], [18], [21]. In DMS [13], a model based on ResNet [21] is used to address dense material segmentation. In contrast to DMS [13], we adopt a Vision Transformer backbone, e.g., ViT [11], to perform material semantic segmentation, with the aim of extracting long-distance dependencies between image patches, since long-range contextual information is crucial for a robust representation of diverse materials [10].…”
Section: B. Materials Semantic Segmentation (mentioning)
confidence: 99%
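The passage above contrasts a ResNet-based model with a ViT backbone that applies self-attention across a sequence of image patches, so every patch can attend to every other patch regardless of distance. A minimal NumPy sketch of that patch-sequence self-attention, assuming a single head and hypothetical projection matrices (not the authors' implementation):

```python
import numpy as np

def patchify(img, p):
    """Split an (H, W, C) image into flattened non-overlapping p x p patches."""
    H, W, C = img.shape
    patches = img.reshape(H // p, p, W // p, p, C).swapaxes(1, 2)
    return patches.reshape(-1, p * p * C)            # (num_patches, p*p*C)

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over the patch sequence."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])          # all-pairs patch similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over all patches
    return weights @ v                               # global mixing: long-range context
```

Because the attention weights span the full patch sequence, dependencies between distant patches are captured in a single layer, whereas a convolution only mixes a local neighborhood.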