Modern weakly supervised methods for event detection (ED) avoid time-consuming human annotation and achieve promising results by learning from auto-labeled data. However, these methods typically rely on sophisticated pre-defined rules as well as existing instances in knowledge bases for automatic annotation, and thus suffer from low coverage, topic bias, and data noise. To address these issues, we build a large event-related candidate set with good coverage and then apply an adversarial training mechanism to iteratively identify informative instances in the candidate set and filter out noisy ones. Experiments on two real-world datasets show that our candidate selection and adversarial training cooperate to obtain more diverse and accurate training data for ED, and significantly outperform the state-of-the-art methods in various weakly supervised scenarios. The datasets and source code can be obtained from https://github.com/thunlp/Adv-ED.
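As a concrete illustration of the adversarial filtering idea, the sketch below (not the paper's exact architecture) trains a discriminator to separate a small trusted set from the auto-labeled candidate pool and then keeps the candidates it scores as trusted-like; the feature dimension, network shape, and threshold are illustrative assumptions, and random vectors stand in for encoded instances.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
DIM = 64  # assumed size of pre-computed instance features

# Discriminator: scores how much an auto-labeled candidate resembles
# trusted (human-verified) training data.
disc = nn.Sequential(nn.Linear(DIM, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_discriminator(trusted, candidates, steps=300):
    """Trusted instances -> label 1, raw candidates -> label 0."""
    for _ in range(steps):
        opt.zero_grad()
        loss = (bce(disc(trusted), torch.ones(len(trusted), 1)) +
                bce(disc(candidates), torch.zeros(len(candidates), 1)))
        loss.backward()
        opt.step()

def split_candidates(candidates, threshold=0.5):
    """Keep candidates scored as trusted-like; filter the rest as noise.
    The threshold is a tunable assumption."""
    with torch.no_grad():
        conf = torch.sigmoid(disc(candidates)).squeeze(1)
    return candidates[conf > threshold], candidates[conf <= threshold]

# Toy data: clean candidates overlap the trusted distribution,
# noisy ones are shifted away from it.
trusted = torch.randn(512, DIM) + 1.0
cands = torch.cat([torch.randn(256, DIM) + 1.0,   # clean-ish
                   torch.randn(512, DIM) - 1.0])  # noisy
train_discriminator(trusted, cands)
kept, dropped = split_candidates(cands)
print(f"kept {len(kept)} informative, dropped {len(dropped)} noisy")
```

Re-running discriminator training and candidate splitting, with the kept instances fed back as training data, would give the iterative identify-and-filter loop described above.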
We propose a hierarchical distributed algorithm to solve optimal power flow (OPF) problems that dispatch controllable distributed energy resources (DERs) for voltage regulation at minimum cost. The proposed algorithm scales to large multi-phase distribution networks by jointly exploiting the tree/subtree structure of a large radial distribution network and the structure of the linearized distribution power flow (LinDistFlow) model to derive a hierarchical, distributed implementation of the primal-dual gradient algorithm that solves OPF. The proposed implementation significantly reduces the computational load compared to a centrally coordinated implementation of the same primal-dual algorithm, without compromising optimality. Numerical results on a 4,521-node test feeder show that the designed algorithm achieves a more than 10-fold speedup in convergence over the centrally coordinated primal-dual algorithm by reducing and distributing the computational load.
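To make the underlying optimization concrete, here is a minimal single-machine sketch of a primal-dual gradient iteration on a LinDistFlow-style linearized voltage model v = v0 + Rp; the hierarchical, distributed decomposition over subtrees is omitted, and the sensitivity matrix, cost weights, step sizes, and voltage band are all invented for illustration.

```python
import numpy as np

np.random.seed(0)
n = 10                                 # controllable DERs / nodes
R = 0.05 * np.eye(n) + 0.01            # assumed voltage sensitivity (p.u.)
v0 = 1.02 + 0.06 * np.random.rand(n)   # uncontrolled voltages; some > 1.05
c = np.ones(n)                         # quadratic dispatch cost weights
v_min, v_max = 0.95, 1.05              # regulation band (p.u.)
alpha, beta = 0.2, 2.0                 # primal / dual step sizes

p = np.zeros(n)                        # DER injections (primal variable)
lam_up = np.zeros(n)                   # multipliers for v <= v_max
lam_lo = np.zeros(n)                   # multipliers for v >= v_min

for _ in range(5000):
    v = v0 + R @ p                     # linearized voltage model
    # Primal descent on the Lagrangian: cost gradient plus voltage duals.
    p -= alpha * (c * p + R.T @ (lam_up - lam_lo))
    # Dual ascent, projected onto the nonnegative orthant.
    lam_up = np.maximum(0.0, lam_up + beta * (v - v_max))
    lam_lo = np.maximum(0.0, lam_lo + beta * (v_min - v))

v = v0 + R @ p
print("max voltage:", v.max(), "min voltage:", v.min())
```

In the paper's setting, the gradient and dual updates above would be computed locally and aggregated along the feeder's tree/subtree hierarchy rather than in one loop, which is the source of the reported speedup.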
Recently, pre-trained language models have mostly followed the pre-train-then-fine-tune paradigm and achieved great performance on various downstream tasks. However, since the pre-training stage is typically task-agnostic and the fine-tuning stage usually suffers from insufficient supervised data, the models cannot always capture domain-specific and task-specific patterns well. In this paper, we propose a three-stage framework that adds a task-guided pre-training stage with selective masking between general pre-training and fine-tuning. In this stage, the model is trained by masked language modeling on in-domain unsupervised data to learn domain-specific patterns, and we propose a novel selective masking strategy to learn task-specific patterns. Specifically, we design a method to measure the importance of each token in a sequence and selectively mask the important tokens. Experimental results on two sentiment analysis tasks show that our method can achieve comparable or even better performance with less than 50% of the computation cost, indicating that our method is both effective and efficient. The source code of this paper can be obtained from https://github.com/thunlp/SelectiveMasking.
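As one way to realize the selective masking step, the sketch below scores each token by how much occluding it lowers a downstream task scorer's output and masks the top-scoring tokens; the scorer `task_score`, the mask id, and the masking ratio are hypothetical stand-ins, and the paper's actual importance measure may differ.

```python
MASK_ID = 103  # [MASK] id in BERT-style vocabularies (assumption)

def token_importance(token_ids, task_score):
    """Score each token by how much occluding it lowers a downstream
    task scorer's output; `task_score` is a hypothetical stand-in for,
    e.g., a classifier's confidence on the gold label."""
    base = task_score(token_ids)
    scores = []
    for i in range(len(token_ids)):
        probe = list(token_ids)
        probe[i] = MASK_ID                       # occlude one token
        scores.append(base - task_score(probe))  # drop in task score
    return scores

def selective_mask(token_ids, task_score, ratio=0.15):
    """Mask the most task-relevant tokens so that in-domain masked
    language modeling focuses on task-specific patterns."""
    k = max(1, int(ratio * len(token_ids)))
    scores = token_importance(token_ids, task_score)
    top = sorted(range(len(token_ids)), key=lambda i: -scores[i])[:k]
    masked = list(token_ids)
    for i in top:
        masked[i] = MASK_ID
    return masked, top

# Toy usage: a scorer that only "cares" about token id 7.
toy_score = lambda ids: float(sum(t == 7 for t in ids))
print(selective_mask([5, 7, 9, 7, 2, 4], toy_score, ratio=0.3))
# -> ([5, 103, 9, 7, 2, 4], [1]): an important token is masked
```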