Multi-modal pre-training models have been intensively explored to bridge vision and language in recent years. However, most of them explicitly model the cross-modal interaction between image-text pairs, assuming that a strong semantic correlation exists between the text and image modalities. Since this strong assumption is often invalid in real-world scenarios, we choose to implicitly model the cross-modal correlation for large-scale multi-modal pre-training, which is the focus of the Chinese project 'Wen-Lan' led by our team. Specifically, under the weak correlation assumption over image-text pairs, we propose a two-tower pre-training model called BriVL within the cross-modal contrastive learning framework. Unlike OpenAI CLIP, which adopts a simple contrastive learning method, we devise a more advanced algorithm by adapting the latest method MoCo to the cross-modal scenario. By building a large queue-based dictionary, our BriVL can incorporate more negative samples with limited GPU resources. We further construct a large Chinese multi-source image-text dataset called RUC-CAS-WenLan for pre-training our BriVL model. Extensive experiments demonstrate that the pre-trained BriVL model outperforms both UNITER and OpenAI CLIP on various downstream tasks.
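The core idea described above is a two-tower contrastive objective in which a MoCo-style queue of past embeddings supplies extra negatives beyond the current batch. Below is a minimal PyTorch sketch of such a queue-based cross-modal contrastive module; the encoder interfaces, embedding dimension, queue size, temperature, and the omission of MoCo's momentum key-encoder update are illustrative assumptions, not the actual BriVL implementation.

```python
# Minimal sketch of queue-based cross-modal contrastive learning
# (MoCo-style negatives for image-text pairs). All names, dimensions,
# and hyperparameters are assumptions; the momentum key-encoder update
# of MoCo is omitted for brevity.
import torch
import torch.nn.functional as F

class CrossModalMoCo(torch.nn.Module):
    def __init__(self, img_encoder, txt_encoder, dim=256, queue_size=8192, tau=0.07):
        super().__init__()
        self.img_encoder = img_encoder   # two-tower: separate image encoder
        self.txt_encoder = txt_encoder   # and text encoder
        self.tau = tau
        # queue of past text embeddings serving as additional negatives
        self.register_buffer("queue", F.normalize(torch.randn(queue_size, dim), dim=1))
        self.register_buffer("ptr", torch.zeros(1, dtype=torch.long))

    @torch.no_grad()
    def _enqueue(self, keys):
        n = keys.shape[0]
        p = int(self.ptr)
        self.queue[p:p + n] = keys                       # assumes queue_size % n == 0
        self.ptr[0] = (p + n) % self.queue.shape[0]

    def forward(self, images, texts):
        q = F.normalize(self.img_encoder(images), dim=1)          # image queries
        k = F.normalize(self.txt_encoder(texts), dim=1).detach()  # text keys
        pos = (q * k).sum(dim=1, keepdim=True)           # positive pair logits
        neg = q @ self.queue.t()                         # negatives from the queue
        logits = torch.cat([pos, neg], dim=1) / self.tau
        labels = torch.zeros(logits.shape[0], dtype=torch.long, device=logits.device)
        self._enqueue(k)
        return F.cross_entropy(logits, labels)           # InfoNCE loss
```

In this sketch the queue lets each query contrast against thousands of stale text embeddings without recomputing them, which is what allows a large effective negative set on limited GPU memory.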
Traditional gene selection methods for microarray data mainly considered the features' relevance by evaluating their utility for achieving accurate prediction or by exploiting data variance and distribution, and the selected genes were usually poorly explicable. To improve the interpretability of the selected genes as well as the prediction accuracy, an improved gene selection method based on binary particle swarm optimization (BPSO) and prior information is proposed in this paper. In the proposed method, BPSO encoding gene-to-class sensitivity (GCS) information is used to perform gene selection. The GCS information, extracted from the samples by an extreme learning machine (ELM), is encoded into the selection process in four aspects: initializing the particles, updating the particles, modifying the maximum velocity, and adaptively adopting a mutation operation. Constrained by the GCS information, the new method can select functional gene subsets that are significantly sensitive to the samples' classes. With the few discriminative genes selected by the proposed method, ELM, K-nearest neighbor, and support vector machine classifiers achieve high prediction accuracy on five public microarray datasets, which in turn verifies the efficiency and effectiveness of the proposed gene selection method.
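To make the procedure concrete, the sketch below shows a generic binary PSO gene selector in which gene-to-class sensitivity (GCS) scores bias particle initialization and the velocity is bounded. The fitness function, parameter values, and the specific way GCS enters initialization are assumptions for illustration and do not reproduce the paper's exact update and mutation rules.

```python
# Minimal sketch of binary PSO for gene selection with GCS-biased
# initialization; parameters and the fitness interface are illustrative.
import numpy as np

def bpso_gene_select(X, y, gcs, fitness, n_particles=30, iters=50,
                     w=0.7, c1=1.5, c2=1.5, rng=np.random.default_rng(0)):
    n_genes = X.shape[1]
    # bias initial particles toward genes with high gene-to-class sensitivity
    prob = gcs / gcs.sum()
    pos = (rng.random((n_particles, n_genes)) < prob * n_genes * 0.1).astype(int)
    vel = np.zeros((n_particles, n_genes))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(X[:, p == 1], y) if p.any() else -np.inf for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -4, 4)                        # maximum velocity bound
        pos = (rng.random(pos.shape) < 1 / (1 + np.exp(-vel))).astype(int)
        fit = np.array([fitness(X[:, p == 1], y) if p.any() else -np.inf for p in pos])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        gbest = pbest[pbest_fit.argmax()].copy()
    return gbest  # binary mask over genes

# usage: pass a fitness such as cross-validated classifier accuracy on the
# selected gene columns, e.g. fitness = lambda Xs, y: cross_val_score(clf, Xs, y).mean()
```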
Growing evidence indicates that autism spectrum disorder (ASD) is a neuropsychological disconnection syndrome that can be analyzed using various complex network metrics serving as pathology biomarkers. Recently, community detection and analysis rooted in complex network and graph theories have been introduced to investigate changes in the community structure of resting-state functional networks under neurological pathologies. However, the potential of hidden patterns in the modular organization of networks derived from resting-state functional magnetic resonance imaging to predict brain pathology has never been investigated. In this study, we present a novel analysis technique to identify alterations in community patterns in functional networks under ASD. In addition, we design machine learning classifiers to predict the clinical class of patients with ASD and controls by using only community pattern quality metrics as features. Analyses conducted on six publicly available datasets from 235 subjects, including patients with ASD and age-matched controls, revealed that the modular structure is significantly disturbed in patients with ASD. Machine learning algorithms showed that the predictive power of our five metrics is relatively high (~85.16% peak accuracy for single-site data and ~75.00% peak accuracy for multi-site data). These results lend further credence to the dysconnectivity theory of this pathology.
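The pipeline implied here has two stages: derive community-quality metrics from each subject's functional connectivity graph, then classify subjects using only those metrics. The sketch below illustrates one such pipeline with NetworkX and scikit-learn; the thresholding rule, the five example metrics, and the SVM classifier are assumptions for illustration, not the study's exact metrics or model.

```python
# Minimal sketch: community-structure metrics from a functional connectivity
# matrix, used as features for a classifier. Metric choices are illustrative.
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def community_features(corr, threshold=0.3):
    """Build a binary graph from a correlation matrix and return modular-structure features."""
    A = (np.abs(corr) > threshold).astype(int)
    np.fill_diagonal(A, 0)
    G = nx.from_numpy_array(A)
    comms = greedy_modularity_communities(G)
    sizes = np.array([len(c) for c in comms])
    return np.array([
        modularity(G, comms),        # quality of the detected partition
        len(comms),                  # number of communities
        sizes.mean(),                # average community size
        sizes.std(),                 # dispersion of community sizes
        nx.transitivity(G),          # global clustering as a complementary metric
    ])

# usage (corr_matrices: list of subject-level FC matrices, labels in {0, 1}):
# X = np.vstack([community_features(c) for c in corr_matrices])
# print(cross_val_score(SVC(kernel="rbf"), X, labels, cv=5).mean())
```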
Content-based image retrieval (CBIR) has been an active research area over the last ten years, and a variety of techniques have been developed. However, retrieving images on the basis of low-level features has proven unsatisfactory, and new techniques are needed to support high-level queries. Research efforts are needed to bridge the gap between high-level semantics and low-level features. In this paper, we present a novel approach to support semantics-based image retrieval. Our approach is based on the monotonic tree, a derivation of the contour tree for use with discrete data. The structural elements of an image are modeled as branches (or subtrees) of the monotonic tree. These structural elements are classified and clustered on the basis of such properties as color, spatial location, harshness, and shape. Each cluster corresponds to some semantic feature. This scheme is applied to the analysis and retrieval of scenery images. Comparisons of the experimental results of this approach with those of conventional techniques using low-level features demonstrate the effectiveness of our approach.
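The classification-and-clustering step can be pictured as grouping per-element feature vectors so that each cluster maps to a semantic label usable for indexing. The sketch below shows only that step with k-means; the feature layout and the choice of k-means are assumptions for illustration, and the monotonic-tree extraction itself is not reproduced here.

```python
# Minimal sketch of clustering structural elements by simple properties
# (color, location, harshness, shape score). Feature layout is assumed.
import numpy as np
from sklearn.cluster import KMeans

def cluster_structural_elements(features, n_clusters=8, seed=0):
    """features: (n_elements, d) array, e.g. [mean R, G, B, x, y, harshness, shape score]."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict(features)
    return labels, km.cluster_centers_   # cluster id per element + cluster prototypes

# usage: map each cluster id to a semantic label (e.g. "sky", "water", "foliage")
# and index images by the semantic labels of their structural elements.
```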