Cross-modal retrieval has become a prominent research topic for retrieval across multimedia data such as images and text. A two-stage learning framework is widely adopted by most existing methods based on deep neural networks (DNNs): the first learning stage generates a separate representation for each modality, and the second learning stage learns the cross-modal common representation. However, the existing methods have three limitations: (1) In the first learning stage, they only model intra-modality correlation, but ignore inter-modality correlation with its rich complementary context. (2) In the second learning stage, they only adopt shallow networks with single-loss regularization, but ignore the intrinsic relevance of intra-modality and inter-modality correlation. (3) Only original instances are considered, while the complementary fine-grained clues provided by their patches are ignored. To address these problems, this paper proposes a cross-modal correlation learning (CCL) approach with multi-grained fusion by hierarchical network, and the contributions are as follows: (1) In the first learning stage, CCL exploits multi-level association with joint optimization to preserve the complementary context from intra-modality and inter-modality correlation simultaneously. (2) In the second learning stage, a multi-task learning strategy is designed to adaptively balance the intra-modality semantic category constraints and inter-modality pairwise similarity constraints. (3) CCL adopts multi-grained modeling, which fuses coarse-grained instances and fine-grained patches to make cross-modal correlation more precise. Compared with 13 state-of-the-art methods on 6 widely used cross-modal datasets, the experimental results show that our CCL approach achieves the best performance.
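A minimal sketch of the second-stage multi-task objective described in contribution (2), assuming PyTorch; the encoder dimensions, the shared classifier, and the uncertainty-style weighting used to balance the two constraints are illustrative assumptions, not CCL's exact formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CommonSpaceHead(nn.Module):
    """Projects modality-specific features into a shared space (hypothetical dimensions)."""
    def __init__(self, img_dim=4096, txt_dim=300, common_dim=256, num_classes=10):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, common_dim)
        self.txt_proj = nn.Linear(txt_dim, common_dim)
        self.classifier = nn.Linear(common_dim, num_classes)
        # Learnable log-weights: one simple way to "adaptively" balance the two tasks.
        self.log_weight = nn.Parameter(torch.zeros(2))

    def forward(self, img_feat, txt_feat):
        img_c = F.normalize(self.img_proj(img_feat), dim=1)
        txt_c = F.normalize(self.txt_proj(txt_feat), dim=1)
        return img_c, txt_c

def multi_task_loss(head, img_c, txt_c, labels):
    # Intra-modality semantic category constraint: both modalities predict the shared label.
    cls_loss = F.cross_entropy(head.classifier(img_c), labels) + \
               F.cross_entropy(head.classifier(txt_c), labels)
    # Inter-modality pairwise similarity constraint: paired embeddings should be close.
    pair_loss = 1.0 - F.cosine_similarity(img_c, txt_c, dim=1).mean()
    # Uncertainty-style weighting as a stand-in for CCL's adaptive balancing strategy.
    w = torch.exp(-head.log_weight)
    return w[0] * cls_loss + w[1] * pair_loss + head.log_weight.sum()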
It is well known that a bivariate distribution belongs to the domain of attraction of an extreme value distribution G if and only if the marginals belong to the domain of attraction of the univariate marginal extreme value distributions and the dependence function converges to the stable tail dependence function of G. Hall and Welsh (1984, Ann. Statist. 12, 1079-1084) and Drees (1997b, Ann. Statist., to appear) addressed the problem of finding optimal rates of convergence for estimators of the extreme value index of a univariate distribution. The present paper deals with the corresponding problem for the stable tail dependence function. First, an upper bound on the rate of convergence for estimators of the stable tail dependence function is established. Then it is shown that this bound is sharp by proving that it is attained by the tail empirical dependence function. Finally, we determine the limit distribution of this estimator if the dependence function satisfies a certain second-order condition. © 1998 Academic Press. AMS 1991 subject classifications: primary 62G05, 62H12; secondary 62G30, 62G35. Key words and phrases: asymptotic normality, bivariate extreme value distribution, domain of attraction, rate of convergence, stable tail dependence function, tail empirical dependence function.
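To make the object of study concrete, the display below gives the standard definition of the stable tail dependence function and one common form of its tail empirical estimator; neither formula is reproduced from the abstract itself, and the notation (marginals F_1, F_2, ranks R_i^X, R_i^Y, intermediate sequence k = k_n) is assumed for illustration.

\[
  l(x, y) \;=\; \lim_{t \downarrow 0} \frac{1}{t}\,
  P\bigl( 1 - F_1(X) \le t x \ \text{or}\ 1 - F_2(Y) \le t y \bigr),
  \qquad x, y \ge 0 ,
\]
\[
  \hat{l}_n(x, y) \;=\; \frac{1}{k} \sum_{i=1}^{n}
  \mathbf{1}\bigl\{ R_i^X > n + 1 - k x \ \text{or}\ R_i^Y > n + 1 - k y \bigr\}.
\]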
Multimedia retrieval plays an indispensable role in big data utilization. Past efforts mainly focused on single-media retrieval. However, the requirements of users are highly flexible, such as retrieving relevant audio clips with an image query. Consequently, challenges stemming from the "media gap", which means that representations of different media types are inconsistent, have attracted increasing attention. Cross-media retrieval is designed for scenarios where the queries and retrieval results are of different media types. As a relatively new research topic, its concepts, methodologies and benchmarks are still not clear in the literature. To address these issues, we review more than 100 references, give an overview of the concepts, methodologies, major challenges and open issues, and build up benchmarks including datasets and experimental results. Researchers can directly adopt the benchmarks to promptly evaluate their proposed methods. This will help them to focus on algorithm design rather than on the time-consuming reproduction of compared methods and results. Notably, we have constructed a new dataset, XMedia, which is the first publicly available dataset with up to five media types (text, image, video, audio and 3D model). We believe this overview will attract more researchers to focus on cross-media retrieval and will be helpful to them.
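As a concrete illustration of the retrieval step once a common representation space has been learned, here is a minimal nearest-neighbour sketch assuming NumPy; the encoder names embed_image and embed_audio are hypothetical placeholders, not functions from any surveyed method.

import numpy as np

def retrieve(query_vec, gallery_vecs, top_k=5):
    """Rank gallery items of another media type by cosine similarity to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    g = gallery_vecs / np.linalg.norm(gallery_vecs, axis=1, keepdims=True)
    scores = g @ q
    return np.argsort(-scores)[:top_k]

# Example usage (hypothetical encoders mapping both media types into the same space):
# ranked = retrieve(embed_image(img), np.stack([embed_audio(a) for a in audio_clips]))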
Cross-modal retrieval has drawn wide interest for retrieval across different modalities of data (such as text, image, video, audio and 3D model). However, existing methods based on deep neural networks (DNNs) often face the challenge of insufficient cross-modal training data, which limits training effectiveness and easily leads to overfitting. Transfer learning is usually adopted to relieve the problem of insufficient training data, but it mainly focuses on knowledge transfer from a large-scale single-modal source domain (such as ImageNet) to a single-modal target domain. In fact, such large-scale single-modal datasets also contain rich modal-independent semantic knowledge that can be shared across different modalities. Besides, large-scale cross-modal datasets are very labor-intensive to collect and label, so it is important to fully exploit the knowledge in single-modal datasets to boost cross-modal retrieval. To achieve this goal, this paper proposes the modal-adversarial hybrid transfer network (MHTN), which to the best of our knowledge is the first work to realize knowledge transfer from a single-modal source domain to a cross-modal target domain and learn a cross-modal common representation. It is an end-to-end architecture with two subnetworks: (1) A modal-sharing knowledge transfer subnetwork is proposed to jointly transfer knowledge from a large-scale single-modal dataset in the source domain to all modalities in the target domain with a star network structure, which distills modal-independent supplementary knowledge for promoting cross-modal common representation learning. (2) A modal-adversarial semantic learning subnetwork is proposed to construct an adversarial training mechanism between the common representation generator and a modality discriminator, making the common representation discriminative for semantics but indiscriminative for modalities, so as to enhance cross-modal semantic consistency during the transfer process. Comprehensive experiments on 4 widely used datasets show its effectiveness and generality.
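A minimal sketch of the modal-adversarial idea in subnetwork (2), assuming PyTorch; gradient reversal is used here as one standard way to implement the generator-versus-modality-discriminator game, and the layer sizes and names are illustrative assumptions rather than MHTN's exact architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated (scaled) gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None

class ModalAdversarialHead(nn.Module):
    def __init__(self, common_dim=256, num_classes=10, num_modalities=2):
        super().__init__()
        self.semantic = nn.Linear(common_dim, num_classes)     # keep semantics discriminative
        self.modality = nn.Linear(common_dim, num_modalities)  # should fail to tell modalities apart

    def losses(self, common_repr, labels, modality_ids, lam=1.0):
        sem_loss = F.cross_entropy(self.semantic(common_repr), labels)
        # Reversed gradient pushes the common representation to confuse the modality discriminator.
        adv_loss = F.cross_entropy(self.modality(GradReverse.apply(common_repr, lam)), modality_ids)
        return sem_loss + adv_loss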
The amyloid precursor protein (APP) is a broadly expressed transmembrane protein that has a significant role in the pathogenesis of Alzheimer's disease (AD). APP can be cleaved at multiple sites to generate a series of fragments including the amyloid β (Aβ) peptides and APP intracellular domain (AICD). Although Aβ peptides have been proposed to be the main cause of AD pathogenesis, the role of AICD has been underappreciated. Here we report that APP induces AICD-dependent cell death in Drosophila neuronal and non-neuronal tissues. Our genetic screen identified the transcription factor forkhead box O (FoxO) as a crucial downstream mediator of APP-induced cell death and locomotion defect. In mammalian cells, AICD physically interacts with FoxO in the cytoplasm, translocates with FoxO into the nucleus upon oxidative stress, and promotes FoxO-induced transcription of pro-apoptotic gene Bim. These data demonstrate that APP modulates FoxO-mediated cell death through AICD, which acts as a transcriptional co-activator of FoxO.