Visual events in daily life are usually accompanied by sounds. We pose the question: can a machine learn the correspondence between a visual scene and its sound, and localize the sound source by observing paired sound and visual scenes alone, as humans do? In this paper, we propose a novel unsupervised algorithm for localizing sound sources in visual scenes. We develop a two-stream network, one stream per modality, coupled with an attention mechanism for sound source localization. Moreover, although the network is formulated within the unsupervised learning framework, a simple modification extends it to a unified architecture that also covers supervised and semi-supervised settings. We also introduce a new sound source dataset for performance evaluation. Our empirical evaluation shows that the unsupervised method can arrive at false conclusions in some cases. We show that even a small amount of supervision corrects these false conclusions and enables effective localization of the sound source in a visual scene.
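The attention idea sketched in the abstract can be illustrated as a similarity map between a sound embedding and visual features at each spatial location. The code below is a minimal toy sketch under that assumption; the feature vectors and the dot-product scoring are illustrative placeholders, not the paper's learned networks.

```python
import math

def localize_sound(sound_vec, visual_feats):
    """Toy attention map: dot-product similarity between one sound
    embedding and the visual feature at each spatial location,
    normalized with a softmax. In the actual method both embeddings
    would be produced by trained network streams."""
    scores = [sum(s * v for s, v in zip(sound_vec, feat))
              for feat in visual_feats]
    m = max(scores)                       # subtract max for stability
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: 3 spatial locations with 4-dim features (made up).
sound = [1.0, 0.0, 0.0, 0.0]
feats = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.1, 0.1, 0.1, 0.1]]
attn = localize_sound(sound, feats)      # highest weight at location 0
```

The location with the largest attention weight is taken as the predicted sound source; in the unsupervised setting this map is shaped only by the co-occurrence of sound and image pairs.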
In this paper, we propose a unified, end-to-end trainable multi-task network that jointly handles lane and road marking detection and recognition, guided by a vanishing point, under adverse weather conditions. We tackle rainy and low-illumination conditions, which have not been studied extensively until now because of the clear challenges they pose. For example, images taken on rainy days suffer from low illumination, while wet roads cause light reflections that distort the appearance of lanes and road markings. At night, color distortion occurs under limited illumination. As a result, no benchmark dataset exists, and only a few algorithms have been developed that work under poor weather conditions. To address this shortcoming, we build a lane and road marking benchmark consisting of about 20,000 images with 17 lane and road marking classes under four scenarios: no rain, rain, heavy rain, and night. We train and evaluate several versions of the proposed multi-task network and validate the importance of each task. The resulting approach, VPGNet, can detect and classify lanes and road markings and predict a vanishing point with a single forward pass. Experimental results show that our approach achieves high accuracy and robustness under various conditions in real time (20 fps). The benchmark and the VPGNet model will be made publicly available.
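A multi-task network like the one described is typically trained by combining per-task losses into one objective so that a single forward pass serves all heads. The sketch below shows only that weighted combination; the task names and weight values are hypothetical placeholders, not VPGNet's actual loss terms or coefficients.

```python
def multitask_loss(losses, weights):
    """Weighted sum of per-task losses. Tasks here (lane, road
    marking, vanishing point) mirror the abstract; the weights are
    illustrative, not the paper's tuned values."""
    return sum(weights[task] * losses[task] for task in losses)

# Hypothetical per-task loss values from one training batch.
losses = {"lane": 0.8, "marking": 1.2, "vanishing_point": 0.5}
weights = {"lane": 1.0, "marking": 1.0, "vanishing_point": 0.5}
total = multitask_loss(losses, weights)   # 0.8 + 1.2 + 0.25 = 2.25
```

Validating "the importance of each task," as the abstract mentions, would then amount to ablating individual heads (setting a weight to zero) and comparing results.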
O-GlcNAcylation (O-linked β-N-acetylglucosaminylation) is notably decreased in Alzheimer’s disease (AD) brain. Necroptosis is activated in AD brain and is positively correlated with neuroinflammation and tau pathology. However, the links among altered O-GlcNAcylation, β-amyloid (Aβ) accumulation, and necroptosis are unclear. Here, we found that O-GlcNAcylation plays a protective role in AD by inhibiting necroptosis. Necroptosis was increased in AD patients and an AD mouse model compared with controls; however, decreased necroptosis due to O-GlcNAcylation of RIPK3 (receptor-interacting serine/threonine protein kinase 3) was observed in 5xFAD mice with insufficient O-linked β-N-acetylglucosaminase. O-GlcNAcylation of RIPK3 suppresses phosphorylation of RIPK3 and its interaction with RIPK1. Moreover, increased O-GlcNAcylation ameliorated AD pathology, including Aβ burden, neuronal loss, neuroinflammation, and mitochondrial damage, and restored the M2 phenotype and phagocytic activity of microglia. Thus, our data establish the influence of O-GlcNAcylation on Aβ accumulation and neurodegeneration, suggesting O-GlcNAcylation–based treatments as potential interventions for AD.
There were small improvements in motor strength and SCIM-III scores in the RT group, but there were no statistically significant differences between the groups. Further studies are required for a better understanding of the effects of RT for people with tetraplegia.
In daily life, graphic symbols such as traffic signs and brand logos are ubiquitous because of their intuitive expressiveness across language boundaries. We tackle an open-set graphic symbol recognition problem via one-shot classification, using a prototypical image as the single training example for each novel class. Our approach learns an embedding space that generalizes to novel tasks. We propose the variational prototyping-encoder (VPE), which learns, as a meta-task, an image translation from real-world input images to their corresponding prototypical images. As a result, VPE learns image similarity as well as prototypical concepts, which distinguishes it from widely used metric learning based approaches. Our experiments with diverse datasets demonstrate that the proposed VPE performs favorably against competing metric learning based one-shot methods. Our qualitative analyses also show that the meta-task induces an embedding space well suited to representing unseen data.
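At inference time, one-shot classification with prototypes reduces to nearest-prototype search in the learned embedding space. The sketch below assumes that setup; the `embed` function is a hypothetical stand-in (plain L2 normalization) for VPE's trained encoder, and the prototype vectors are made up.

```python
import math

def embed(vec):
    """Hypothetical stand-in for the learned encoder: just
    L2-normalizes the input. The real VPE encoder is a trained
    neural network mapping images to embeddings."""
    n = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / n for x in vec]

def classify(query, prototypes):
    """One-shot classification: return the label of the nearest
    prototype embedding (squared Euclidean distance). Each novel
    class contributes exactly one prototypical image."""
    q = embed(query)
    def dist(label):
        p = embed(prototypes[label])
        return sum((a - b) ** 2 for a, b in zip(q, p))
    return min(prototypes, key=dist)

# Toy 2-D "embeddings" for two novel symbol classes.
protos = {"stop": [1.0, 0.0], "yield": [0.0, 1.0]}
pred = classify([0.9, 0.2], protos)
```

Because the embedding space is learned via the prototype-translation meta-task rather than a pairwise metric loss, no retraining is needed when new symbol classes (new prototypes) are added.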