Attention mechanisms have been found effective for person re-identification (Re-ID). However, the learned "attentive" features are often not naturally uncorrelated or "diverse", which compromises retrieval performance based on the Euclidean distance. We advocate the complementary powers of attention and diversity for Re-ID by proposing an Attentive but Diverse Network (ABD-Net). ABD-Net seamlessly integrates attention modules and diversity regularizations throughout the entire network to learn features that are representative, robust, and more discriminative. Specifically, we introduce a pair of complementary attention modules, focusing on channel aggregation and position awareness, respectively. We then plug in a novel orthogonality constraint that efficiently enforces diversity on both hidden activations and weights. Through an extensive set of ablation studies, we verify that the attentive and diverse terms each contribute to the performance boosts of ABD-Net. It consistently outperforms existing state-of-the-art methods on three popular person Re-ID benchmarks.

We also considered SVDNet [13] and HA-CNN [50], which likewise propose to generate diverse and uncorrelated feature embeddings; ABD-Net surpasses both with significant top-1 and mAP improvements. Overall, our observations endorse the superiority of ABD-Net in combining "attentive" and "diverse".

Attention pattern visualization: we conducted attention visualizations on the final output feature maps of the baseline (XE), baseline (XE) + PAM + CAM, and ABD-Net (XE), as shown in Fig. 5, using the Grad-CAM visualization method [73] (https://github.com/utkuozbulak/pytorch-cnn-visualizations) and the RAM visualization method [74] for testing images. We notice that the feature maps from the baseline show little attentiveness. More results can be found in the supplementary.
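To make the diversity idea concrete, the following is a minimal sketch of one common way to impose orthogonality on both weights and hidden activations: a Frobenius-norm penalty on the row-wise Gram matrix, ||W W^T - I||_F^2. This is an illustrative stand-in rather than the exact ABD-Net regularizer (which uses a spectral-value-difference formulation), and the layer shapes in the example are hypothetical.

```python
# Illustrative orthogonality penalty on weights and activations.
# NOTE: this is a generic Gram-matrix penalty, not the exact ABD-Net regularizer.
import torch

def gram_orthogonality_penalty(mat: torch.Tensor) -> torch.Tensor:
    """Penalize deviation of the row-wise Gram matrix from identity.

    `mat` has shape (rows, features); rows are weight vectors or
    flattened channel activations.
    """
    rows = mat.size(0)
    gram = mat @ mat.t()                                   # (rows, rows) correlations
    eye = torch.eye(rows, device=mat.device, dtype=mat.dtype)
    return ((gram - eye) ** 2).sum() / rows

# Hypothetical conv weight (O_w term) and hidden activations (O_a term).
conv_weight = torch.randn(64, 3, 3, 3, requires_grad=True)
features = torch.randn(8, 64, 16, 16, requires_grad=True)

w_flat = conv_weight.view(conv_weight.size(0), -1)          # (out_ch, in_ch*k*k)
a_flat = features.mean(dim=0).view(features.size(1), -1)    # (channels, H*W)

loss_ortho = gram_orthogonality_penalty(w_flat) + gram_orthogonality_penalty(a_flat)
loss_ortho.backward()  # gradients flow to both the weights and the activations
```

In training, a penalty like this would be added to the cross-entropy/triplet loss with a small weighting coefficient so that it encourages decorrelated features without dominating the main objective.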
Accessory sigma factors, which reprogram RNA polymerase to transcribe specific gene sets, activate bacterial adaptive responses to noxious environments. Here we reconstruct the complete sigma factor regulatory network of the human pathogen Mycobacterium tuberculosis by an integrated approach. The approach combines identification of direct regulatory interactions between M. tuberculosis sigma factors in an E. coli model system, validation of selected links in M. tuberculosis, and extensive literature review. The resulting network comprises 41 direct interactions among all 13 sigma factors. Analysis of network topology reveals (i) a three-tiered hierarchy initiating at master regulators, (ii) high connectivity and (iii) distinct communities containing multiple sigma factors. These topological features are likely associated with multi-layer signal processing and specialized stress responses involving multiple sigma factors. Moreover, the identification of overrepresented network motifs, such as autoregulation and coregulation of sigma and anti-sigma factor pairs, provides structural information that is relevant for studies of network dynamics.
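For readers who want to explore such a topology programmatically, here is a hypothetical sketch (Python with networkx) that counts two of the motif types mentioned above, autoregulation (self-loops) and co-regulated target sets, on a toy edge list. The edges are illustrative only and are not the published M. tuberculosis network.

```python
# Toy motif counting on a directed regulatory graph (regulator -> target).
# Edge list is illustrative only, NOT the published M. tuberculosis network.
import networkx as nx

edges = [("sigA", "sigA"), ("sigA", "sigB"), ("sigB", "sigE"),
         ("sigH", "sigE"), ("sigH", "sigB"), ("sigH", "sigH")]
g = nx.DiGraph(edges)

# Autoregulation motif: a sigma factor that regulates its own gene (self-loop).
autoregulated = [n for n in g.nodes if g.has_edge(n, n)]

# Co-regulation: for each regulator, the distinct targets it controls
# (a sigma/anti-sigma pair sharing a regulator would appear in such a set).
coregulated = {r: sorted(t for t in g.successors(r) if t != r)
               for r in g.nodes if g.out_degree(r) > 1}

print("autoregulated:", autoregulated)            # ['sigA', 'sigH']
print("co-regulated target sets:", coregulated)
```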
In natural language processing (NLP), enormous pre-trained models like BERT have become the standard starting point for training on a range of downstream tasks, and similar trends are emerging in other areas of deep learning. In parallel, work on the lottery ticket hypothesis has shown that models for NLP and computer vision contain smaller matching subnetworks capable of training in isolation to full accuracy and transferring to other tasks. In this work, we combine these observations to assess whether such trainable, transferable subnetworks exist in pre-trained BERT models. For a range of downstream tasks, we indeed find matching subnetworks at 40% to 90% sparsity. We find these subnetworks at (pre-trained) initialization, a deviation from prior NLP research where they emerge only after some amount of training. Subnetworks found on the masked language modeling task (the same task used to pre-train the model) transfer universally; those found on other tasks transfer in a limited fashion if at all. As large-scale pre-training becomes an increasingly central paradigm in deep learning, our results demonstrate that the main lottery ticket observations remain relevant in this context. Code is available at https://github.com/TAMU-VITA/BERT-Tickets.
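As a rough illustration of the pruning step behind such subnetworks, the sketch below performs one round of global magnitude pruning to produce binary masks. The toy model, sparsity level, and the omitted rewind-and-retrain loop are assumptions for illustration, not the paper's exact iterative magnitude pruning (IMP) procedure.

```python
# One round of global magnitude pruning to define a sparse subnetwork.
# Toy model and sparsity are illustrative; the full IMP loop (train, prune,
# rewind to pre-trained weights, repeat) is omitted.
import torch
import torch.nn as nn

def magnitude_masks(model: nn.Module, sparsity: float) -> dict:
    """Return {param_name: 0/1 mask} zeroing the globally smallest weights."""
    weights = {n: p.detach() for n, p in model.named_parameters() if p.dim() > 1}
    all_scores = torch.cat([w.abs().flatten() for w in weights.values()])
    k = int(sparsity * all_scores.numel())
    threshold = all_scores.kthvalue(k).values if k > 0 else all_scores.min() - 1
    return {n: (w.abs() > threshold).float() for n, w in weights.items()}

def apply_masks(model: nn.Module, masks: dict) -> None:
    """Zero out pruned weights in place (the 'subnetwork')."""
    with torch.no_grad():
        for n, p in model.named_parameters():
            if n in masks:
                p.mul_(masks[n])

# Toy usage on a stand-in model; for BERT one would pass the pre-trained
# encoder and re-apply the same masks after rewinding to pre-trained weights.
toy = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 2))
masks = magnitude_masks(toy, sparsity=0.6)   # keep roughly 40% of the weights
apply_masks(toy, masks)
```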