Pre-trained contextual representations (e.g., BERT) have become the foundation for achieving state-of-the-art results on many NLP tasks. However, large-scale pre-training is computationally expensive. ELECTRA, an early attempt to accelerate pre-training, trains a discriminative model that predicts whether each input token was replaced by a generator. Our studies reveal that ELECTRA's success is mainly due to the reduced complexity of its pre-training task: binary classification (replaced token detection) is more efficient to learn than generation (masked language modeling). However, such a simplified task is less semantically informative. To achieve better efficiency and effectiveness, we propose a novel meta-learning framework, MC-BERT. The pre-training task is a multi-choice cloze test with a reject option, where a meta controller network provides the training input and the candidates. Results on the GLUE natural language understanding benchmark demonstrate that our proposed method is both efficient and effective: it outperforms baselines on GLUE semantic tasks given the same computational budget.
* Equal contribution. Work done while interning at Microsoft Research Asia.
2 In BERT, among all tokens to be predicted, 80% are replaced by the [MASK] token, 10% are replaced by a random token, and 10% are left unchanged.
Preprint. Under review.
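The BERT corruption rule described in footnote 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `bert_corrupt`, the toy vocabulary, and the 15% selection rate (`select_prob`, the value commonly used for BERT but not stated in this abstract) are assumptions made for the example.

```python
import random

MASK = "[MASK]"

def bert_corrupt(tokens, vocab, select_prob=0.15, rng=None):
    """Corrupt a token list with the 80/10/10 rule.

    Returns (corrupted_tokens, target_positions). `select_prob` is an
    assumed selection rate; it is not given in the text above.
    """
    if rng is None:
        rng = random.Random()
    corrupted = list(tokens)
    targets = []
    for i in range(len(tokens)):
        if rng.random() >= select_prob:
            continue  # token is not selected for prediction
        targets.append(i)
        r = rng.random()
        if r < 0.8:
            corrupted[i] = MASK               # 80%: replace with [MASK]
        elif r < 0.9:
            corrupted[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token
    return corrupted, targets

if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog".split()
    toy_vocab = ["cat", "runs", "blue", "tree"]  # hypothetical vocabulary
    print(bert_corrupt(sentence, toy_vocab, rng=random.Random(0)))
```

Positions outside `target_positions` are never modified, so a masked language model is trained to predict only the selected tokens.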
ABSTRACT: Many state-of-the-art image matching methods based on feature matching have been widely studied in the remote sensing field. These feature matching methods are highly efficient but suffer from low accuracy and robustness. This paper proposes an improved image matching method based on the SURF algorithm. The proposed method introduces a color invariant transformation, information entropy theory, and a series of constraint conditions to improve feature point detection and matching accuracy. First, a color invariant transformation model is applied to the two images to be matched so that more color information is available during matching, and information entropy theory is used to extract the most information from the two images. Then the SURF algorithm is applied to detect and describe feature points in the images. Finally, constraint conditions, including Delaunay triangulation construction, a similarity function, and projective invariants, are employed to eliminate mismatches and thereby improve matching precision. The proposed method has been validated on remote sensing images, and the results show high precision and robustness.
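The abstract does not say how information entropy is computed or applied. As a rough illustration only, the sketch below computes the Shannon entropy of an image's intensity histogram, a common way to quantify how much information an image or channel carries. The function name `shannon_entropy` and the flat-list pixel representation are assumptions made for the example, not the paper's method.

```python
import math
from collections import Counter

def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of intensity values."""
    total = len(pixels)
    if total == 0:
        return 0.0
    counts = Counter(pixels)
    # H = -sum(p * log2(p)) over the observed intensity levels
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

if __name__ == "__main__":
    flat_patch = [128] * 16            # uniform patch: no information
    varied_patch = [0, 64, 128, 255] * 4
    print(shannon_entropy(flat_patch), shannon_entropy(varied_patch))
```

Under this assumption, a channel or transformed image with higher entropy would be regarded as carrying more information for feature detection.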