Multi-modal pre-training models have been intensively explored to bridge vision and language in recent years. However, most of them explicitly model the cross-modal interaction between image-text pairs, assuming that there exists a strong semantic correlation between the text and image modalities. Since this strong assumption is often invalid in real-world scenarios, we choose to implicitly model the cross-modal correlation for large-scale multi-modal pre-training, which is the focus of the Chinese project 'Wen-Lan' led by our team. Specifically, with the weak correlation assumption over image-text pairs, we propose a two-tower pre-training model called BriVL within the cross-modal contrastive learning framework. Unlike OpenAI CLIP, which adopts a simple contrastive learning method, we devise a more advanced algorithm by adapting the latest method MoCo to the cross-modal scenario. By building a large queue-based dictionary, our BriVL can incorporate more negative samples with limited GPU resources. We further construct a large Chinese multi-source image-text dataset called RUC-CAS-WenLan for pre-training our BriVL model. Extensive experiments demonstrate that the pre-trained BriVL model outperforms both UNITER and OpenAI CLIP on various downstream tasks.
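The queue-based dictionary idea mentioned in the abstract above can be illustrated with a small, generic sketch. This is not the BriVL implementation: the function names, the two-dimensional toy embeddings, the queue size, and the temperature of 0.07 are all assumptions chosen for illustration, and the momentum-updated encoder that MoCo pairs with the queue is left out entirely. The sketch only shows how an InfoNCE-style loss can score one image embedding against its paired text embedding plus negatives held in a fixed-size queue, so the number of negatives is not tied to the batch size.

```python
import math
from collections import deque


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce_with_queue(query, positive_key, negative_queue, temperature=0.07):
    """InfoNCE loss for one query: one positive key plus all queued negatives.

    The temperature value is an assumed default, not taken from the abstract.
    """
    q = l2_normalize(query)
    logits = [dot(q, l2_normalize(positive_key)) / temperature]
    logits += [dot(q, l2_normalize(k)) / temperature for k in negative_queue]
    # Numerically stable log-sum-exp over the positive and all negatives.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    # Cross-entropy with the positive at index 0.
    return log_sum_exp - logits[0]


# Hypothetical fixed-size queue of text embeddings from earlier batches.
queue = deque(maxlen=4)
for old_key in ([0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]):
    queue.append(old_key)

image_emb = [1.0, 0.0]   # toy image-tower output
text_emb = [0.9, 0.1]    # toy text-tower output for the paired caption

loss_aligned = info_nce_with_queue(image_emb, text_emb, queue)
loss_misaligned = info_nce_with_queue(image_emb, [-1.0, 0.0], queue)

# After the step, the current key joins the queue; once the queue is full,
# deque(maxlen=...) drops the oldest entry automatically.
queue.append(text_emb)
```

In this toy setting the loss is small when the paired text embedding points in nearly the same direction as the image embedding, and large when it does not. A symmetric text-to-image term with its own queue would be computed the same way.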
Text generation has become one of the most important yet challenging tasks in natural language processing (NLP). The resurgence of deep learning has greatly advanced this field through neural generation models, especially the paradigm of pretrained language models (PLMs). In this paper, we present an overview of the major advances achieved in the topic of PLMs for text generation. As preliminaries, we present the general task definition and briefly describe the mainstream architectures of PLMs for text generation. As the core content, we discuss how to adapt existing PLMs to model different input data and satisfy special properties in the generated text. We further summarize several important fine-tuning strategies for text generation. Finally, we present several future directions and conclude this paper. Our survey aims to provide text generation researchers with a synthesis of, and pointers to, related research.
With the construction and promotion of the Ubiquitous Power Internet of Things (UPIoT), it is an increasingly urgent challenge to comprehensively improve the recognition accuracy for gas-insulated switchgear (GIS) partial discharge (PD), and to incorporate the model into UPIoT intelligent terminals supported by edge computing in embedded systems. Therefore, this paper proposes a novel MobileNets convolutional neural network (MCNN) model to identify GIS PD patterns. We first construct the PD pattern recognition classification datasets by means of experiments and FDTD simulation, and preprocess the images via binarization. In constructing the MCNN model, depthwise separable convolutions and an inverse residual structure are adopted to deal with the vanishing gradient of the deep convolutional neural network (DCNN) in the GIS PD pattern recognition process. Then, after a graphics standardization process, the MCNN model is trained and tested. The whole training process is visualized with TensorBoard. Compared with other deep learning models and traditional machine learning methods, MCNN stands out in recognition accuracy and time consumption, with a 96.5% overall recognition rate and merely 7.3 seconds of training time. This research explores how to optimize the model by improving the recognition accuracy and by reducing its computing load, storage space and energy consumption for better incorporation into intelligent terminals in the UPIoT context.
INDEX TERMS Gas-insulated switchgear, MobileNets convolutional neural network model, partial discharge, pattern recognition, ubiquitous power Internet of Things.
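The storage reduction that makes depthwise separable convolutions attractive for embedded terminals can be seen with a quick weight count. The sketch below is a generic illustration, not the MCNN architecture: the kernel size and channel counts are hypothetical, and bias terms are ignored. It compares a standard convolution with the depthwise-plus-pointwise factorization used in MobileNets-style networks.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise conv followed by a 1x1 pointwise conv (bias ignored)."""
    depthwise = kernel * kernel * c_in   # one kernel x kernel filter per input channel
    pointwise = c_in * c_out             # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 32 input channels, 64 output channels.
std = standard_conv_params(3, 32, 64)        # 18432 weights
sep = depthwise_separable_params(3, 32, 64)  # 288 + 2048 = 2336 weights
ratio = sep / std                            # equals 1/c_out + 1/kernel**2
```

For this assumed layer the factorized form needs roughly one eighth of the weights, which matches the general ratio 1/c_out + 1/kernel² for this kind of factorization.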