3G structure for image caption generation

Yuan, Aihong; Li, Xuelong; Lu, Xiaoqiang

doi:10.1016/j.neucom.2018.10.059

Cited by 32 publications

(6 citation statements)

References 30 publications

“…In order to obtain the complete semantic information of the video moving target, the local motion features of targets in the video image are extracted to ensure the effective behavior recognition of subsequent video targets [5,6]. In order to extract the local motion information of video object effectively, the global motion information must be estimated in advance, and the global motion information must be eliminated when analyzing the local target motion [7]. In this paper, a fast diamond search and matching method is used to perform the global motion extraction, and then the local dense motion vector field is deduced by optical flow analysis.…”

Section: Construction Of Parabolic Trajectory (mentioning)

confidence: 99%

Parabolic Detection Algorithm of Tennis Serve Based on Video Image Analysis Technology

Tang

2021

Security and Communication Networks

At present, the existing algorithm for detecting the parabola of tennis serves neglects the pre-estimation of the global motion information of tennis balls, which leads to great error and low recognition rate. Therefore, a new algorithm for detecting the parabola of tennis service based on video image analysis is proposed. The global motion information is estimated in advance, and the motion feature of the target is extracted. A tennis appearance model is established by sparse representation, and the data of high-resolution tennis flight appearance model are processed by data fusion technology to track the parabolic trajectory. Based on the analysis of the characteristics of the serve mechanics, according to the nonlinear transformation of the parabolic trajectory state vector, the parabolic trajectory starting point is determined, the parabolic trajectory is obtained, and the detection algorithm of the parabolic service is designed. Experimental results show that compared with the other two algorithms, the algorithm designed in this paper can recognize the trajectory of the parabola at different stages, and the detection accuracy of the parabola is higher in the three-dimensional space of the tennis service.

“…With the significant improvement of the computational power of the hardware equipments, deep learning based methods have shown its advantageous performances on many vision tasks such as video object tracking [38], [39], object detection [40], image captioning [41], [42]. Recently, deep learning based methods have been successfully applied to damage and distress detection tasks.…”

Section: Deep Learning Based Methods For Road Crack Detection (mentioning)

confidence: 99%

Sample and Structure-Guided Network for Road Crack Detection

Fang, Zheng, et al.

2019

IEEE Access

As an indispensable task for traffic management department, road maintenance has attracted much attention during the last decade due to the rapid development of traffic network. As is known, crack is the early form of many road damages, and repair it in time can significantly save the maintenance cost. In this case, how to detect crack regions quickly and accurately becomes a huge demand. Actually, many image processing technique based methods have been proposed for crack detection, but their performances can not meet our expectations. The reason is that, most of these methods use bottom features such as color and texture to detect the cracks, which are easily influenced by the varied conditions such as light and shadow. Inspired by the great successes of machine learning and artificial intelligence, this paper presents a sample and structure guided network for detecting road cracks. Specifically, the proposed network is based on U-Net architecture, which remains the details from input to output by using skip connection strategy. Then, because the scale of crack samples is much smaller than that of non-crack ones, directly using the conventional cross entropy loss can not optimize the network effectively. In this case, the Focal loss is utilized to address the model optimization problem. Additionally, we incorporate the self-attention strategy into the proposed network, which enhances its stability by encoding the 2-order information among different local regions into the final features. Finally, we test the proposed method on four datasets, three public ones with labels and a photographed one without labels, to validate its effectiveness. It is noteworthy that, for the photographed dataset, we design a series of image processing strategies such as contrast enhancement to improve the generalization capability of the proposed method.

“…When the value of min Scale is 0.4, the sample size changes gradually from 0.4-1 to 0.4. Based on the value of min Scale, pyramid shrinkage is applied to a given texture sample drawing to obtain a multilevel progressive shrinkage sample drawing, which is used as a scaling factor in the process of texture synthesis to obtain the basis for the corresponding sample drawing [11,12]. In order to make the neighboring triangle mesh correspond to the neighboring layer, based on max z, set up: the value of S function can only be different one layer within the sample layer if the difference is max z that is, according to max z to divide the value range of S function, the value of the function is divided into different layers according to max z, and the layers corresponding to the neighboring triangle mesh are also adjacent, thus ensuring the continuity of texture in the process of composition.…”

Section: 2 (mentioning)

confidence: 99%

Animation Design Based on 3D Visual Communication Technology

Shan, Wang

2022

Scientific Programming

The depth synthesis of image texture is neglected in the current image visual communication technology, which leads to the poor visual effect. Therefore, the design method of film and TV animation based on 3D visual communication technology is proposed. Collect film and television animation videos through 3D visual communication content production, server processing, and client processing. Through stitching, projection mapping, and animation video image frame texture synthesis, 3D vision conveys animation video image projection. In order to ensure the continuous variation of scaling factors between adjacent triangles of animation and video images, the scaling factor field is constructed. Deep learning is used to extract the deep features and to reconstruct the multiframe animated and animated video images based on visual communication. Based on this, the frame feature of video image under gray projection is identified and extracted, and the animation design based on 3D visual communication technology is completed. Experimental results show that the proposed method can enhance the visual transmission of animation video images significantly and can achieve high-precision reconstruction of video images in a short time.

Abstract: It is a big challenge of computer vision to make machine automatically describe
