With the rapid development of Internet of Things (IoT) technology, image data on the Internet are growing at a remarkable rate, and describing the semantic content of such massive image collections poses great challenges. Attention mechanisms originate from the study of human vision: in cognitive science, because of bottlenecks in information processing, humans selectively attend to a portion of the available information while ignoring the rest. This study discusses a natural-language description generation method for IoT intelligent images based on the attention mechanism. A CMOS image sensor (CIS) built on IoT technology is used for image acquisition and display: an FPGA samples the 16-bit parallel-port data from the CIS, writes it into a FIFO to store the image data, and then transmits it to the host computer for display over a network interface. When sentence descriptions are generated with the encoder-decoder framework, maximum-likelihood estimation is used to maximize the joint probability of the word sequence in the language model, which minimizes the cross-entropy loss. At each time step, additional text features are input alongside the image features, and the image feature vector and text feature vector are combined by an attention-weighted sum. During decoding, the attention mechanism assigns a weight to each image region feature and the long short-term memory (LSTM) network decodes step by step, but a unidirectional LSTM has limited decoding ability. We therefore replace it with a bidirectional LSTM, which dynamically focuses on context information through a forward LSTM and a reverse LSTM. The specificity of the proposed network is 5% higher than that of the 3D-convolution residual-link network. The results show that feeding both the image context and the text context into the LSTM decoder improves the performance of the image description model.
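A minimal sketch of the decoding idea described above, written in PyTorch: text features (word embeddings) and image region features are fused by an attention-weighted sum at each time step, a bidirectional LSTM decodes the fused sequence, and the model is trained by maximum-likelihood estimation via cross-entropy. The class name, layer sizes, dot-product attention scoring, and use of word embeddings as attention queries are illustrative assumptions, not the paper's exact implementation.

```python
# Hypothetical sketch, not the authors' code: attention-weighted fusion of image
# region features and text features, decoded by a bidirectional LSTM.
import torch
import torch.nn as nn


class AttentiveBiLSTMDecoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, region_dim=512, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)      # text features
        self.query = nn.Linear(embed_dim, region_dim)         # project words into region space
        self.bilstm = nn.LSTM(embed_dim + region_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, vocab_size)      # forward + reverse states

    def forward(self, regions, captions):
        # regions:  (batch, num_regions, region_dim) image region features from the encoder
        # captions: (batch, seq_len) token ids used with teacher forcing
        words = self.embed(captions)                           # (B, T, E)
        q = self.query(words)                                  # (B, T, R)
        # Attention: score every image region against each word, then take the
        # weighted sum of region features at each time step.
        scores = torch.bmm(q, regions.transpose(1, 2))         # (B, T, num_regions)
        weights = torch.softmax(scores, dim=-1)
        context = torch.bmm(weights, regions)                  # (B, T, R)
        # Concatenate the text feature and the attended image feature per step.
        fused = torch.cat([words, context], dim=-1)            # (B, T, E + R)
        hidden, _ = self.bilstm(fused)                         # forward LSTM + reverse LSTM
        return self.out(hidden)                                # (B, T, vocab_size)


if __name__ == "__main__":
    decoder = AttentiveBiLSTMDecoder(vocab_size=10000)
    regions = torch.randn(4, 36, 512)                # e.g. 36 region features per image (assumed)
    captions = torch.randint(0, 10000, (4, 20))
    logits = decoder(regions, captions[:, :-1])
    # Maximum-likelihood training: minimise cross-entropy against the shifted targets.
    loss = nn.CrossEntropyLoss()(logits.reshape(-1, 10000), captions[:, 1:].reshape(-1))
    print(loss.item())
```

In this sketch the bidirectional LSTM is applied over the full teacher-forced caption, so the reverse LSTM supplies the "context information" from later tokens during training; at inference time a unidirectional or iterative decoding scheme would be needed, which the abstract does not specify.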