2019
DOI: 10.1016/j.neucom.2018.10.059

3G structure for image caption generation

Abstract: It is a big challenge of computer vision to make machine automatically describe…

Cited by 32 publications (6 citation statements)
References 30 publications
“…In order to obtain the complete semantic information of the video moving target, the local motion features of targets in the video image are extracted to ensure the effective behavior recognition of subsequent video targets [5,6]. In order to extract the local motion information of video object effectively, the global motion information must be estimated in advance, and the global motion information must be eliminated when analyzing the local target motion [7]. In this paper, a fast diamond search and matching method is used to perform the global motion extraction, and then the local dense motion vector field is deduced by optical flow analysis.…”
Section: Construction Of Parabolic Trajectory (mentioning)
confidence: 99%
“…With the significant improvement of the computational power of the hardware equipments, deep learning based methods have shown its advantageous performances on many vision tasks such as video object tracking [38], [39], object detection [40], image captioning [41], [42]. Recently, deep learning based methods have been successfully applied to damage and distress detection tasks.…”
Section: Deep Learning Based Methods For Road Crack Detection (mentioning)
confidence: 99%
“…When the value of min Scale is 0.4, the sample size changes gradually from 0.4-1 to 0.4. Based on the value of min Scale, pyramid shrinkage is applied to a given texture sample drawing to obtain a multilevel progressive shrinkage sample drawing, which is used as a scaling factor in the process of texture synthesis to obtain the basis for the corresponding sample drawing [11,12]. In order to make the neighboring triangle mesh correspond to the neighboring layer, based on max z, set up: the value of S function can only be different one layer within the sample layer if the difference is max z that is, according to max z to divide the value range of S function, the value of the function is divided into different layers according to max z, and the layers corresponding to the neighboring triangle mesh are also adjacent, thus ensuring the continuity of texture in the process of composition.…”
Section: 2 (mentioning)
confidence: 99%