Abstract: Indoor scenes tend to be abundant with planar homogeneous texture, manifesting as regularly repeating scene elements along a plane. In this work, we propose to exploit such structure to facilitate high-level scene understanding. By robustly fitting a texture projection model to optimal dominant frequency estimates in image patches, we arrive at a projective-invariant method to localize such semantically meaningful regions in multi-planar scenes. The recovered projective parameters also allow an affine-ambiguou…
“…Moreover, attention mechanism is widely used in other computer vision and natural language processing tasks [32,33,34,35].…”
Section: Related Work (mentioning)
confidence: 99%
“…In this paper, we retain the attention part as the local image feature. To improve the performance of the language part, gated feedback connecting strategy is used for stacking the LSTM.…”
Section: Related Work (mentioning)
confidence: 99%
“…Image caption generation aims to automatically generate a natural language sentence to describe the content of a given image. It is a vital task of scene understanding which is one of the fundamental goals of computer vision and artificial intelligence [1,2]. However, image caption generation is a challenging task.…”