2019 32nd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI)
DOI: 10.1109/sibgrapi.2019.00041

On Modeling Context from Objects with a Long Short-Term Memory for Indoor Scene Recognition

Abstract: Automatic scene recognition is still regarded as an open challenge, even though there are reports of accuracy surpassing that of humans. This is especially true for indoor scenes, since they are well represented by their composing objects, which constitute highly variable information: objects vary in angle, size, and texture, and are often partially occluded in crowded scenes. Even though Convolutional Neural Networks have shown remarkable performance on most image-related problems, for indoor scenes the top performances w…

Cited by 10 publications (10 citation statements); references 51 publications.
“…To take full advantage of back-propagation, scene representations are extracted from end-to-end trainable CNNs, like DAG-CNN [37], MFAFVNet [31], VSAD [32], G-MS2F [39], and DL-CNN [22]. To focus on the main content of the scene, object detection is used to capture salient regions, such as MetaObject-CNN [33], WELDON [34], SDO [35], and BiLSTM [36]. Since features from multiple CNN layers or multiple views are complementary, many works [19], [21], [76] and GIST [77]…”
Section: A Road Map of Scene Classification in 20 Years (mentioning)
confidence: 99%
“…[24], [40], [41] also explored their complementarity to improve performance. In addition, there exist many strategies (such as attention mechanisms, contextual modeling, and multi-task learning with regularization terms) to enhance representation ability, for example CFA [24], BiLSTM [36], MAPNet [43], MSN [44], and LGN [45]. On the dataset side, because depth images from RGB-D cameras are not vulnerable to illumination changes, researchers have explored RGB-D scene recognition since 2015.…”
Section: A Road Map of Scene Classification in 20 Years (mentioning)
confidence: 99%
“…To utilize object relations for scene recognition, the spatial object-to-object relation is studied for RGB-D scene recognition [8]. In addition, a Long Short-Term Memory modeling method is proposed to investigate object relations with ROI selection [9]. To utilize the object information in the scene, the object model [10] is proposed as complementary semantic information, combined with ResNet to better interpret the given scene.…”
Section: Introduction (mentioning)
confidence: 99%
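The statement above summarizes the general idea behind the paper: select object regions of interest and model their relations with an LSTM to recognize the indoor scene. The sketch below is only a rough illustration of how such a pipeline could be wired; the module names, feature dimension, object ordering, and the 67-class output (as in MIT Indoor-67) are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch: per-object ROI features fed to an LSTM whose final
# hidden state is classified into an indoor scene category.

import torch
import torch.nn as nn

class ObjectContextLSTM(nn.Module):
    def __init__(self, feat_dim=2048, hidden_dim=512, num_scenes=67):
        super().__init__()
        # Sequence model over per-object ROI features (e.g. pooled CNN features)
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        # Scene classifier applied to the last hidden state
        self.classifier = nn.Linear(hidden_dim, num_scenes)

    def forward(self, roi_feats):
        # roi_feats: (batch, num_objects, feat_dim), one feature vector per
        # detected object region, ordered by some heuristic (e.g. detection score)
        _, (h_n, _) = self.lstm(roi_feats)
        return self.classifier(h_n[-1])

# Usage with dummy data: 4 images, each with 8 detected object regions
model = ObjectContextLSTM()
dummy_rois = torch.randn(4, 8, 2048)
scene_logits = model(dummy_rois)  # (4, 67) scores over scene classes
```

In this kind of design the LSTM aggregates contextual information across the sequence of detected objects, so the scene prediction depends on which objects co-occur rather than on any single detection.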