Reasoning about human-object interactions is a core problem in human-centric scene understanding, and detecting such relations poses a unique challenge to vision systems due to large variations in human-object configurations, multiple co-occurring relation instances, and subtle visual differences between relation categories. To address these challenges, we propose a multi-level relation detection strategy that utilizes human pose cues both to capture the global spatial configuration of a relation and to serve as an attention mechanism that dynamically zooms into relevant regions at the human-part level. Specifically, we develop a multi-branch deep network that learns a pose-augmented relation representation at three semantic levels, incorporating interaction context, object features, and detailed semantic part cues. As a result, our approach generates robust predictions on fine-grained human-object interactions with interpretable outputs. Extensive experimental evaluations on public benchmarks show that our model outperforms prior methods by a considerable margin, demonstrating its efficacy in handling complex scenes. Code is available at https://github.com/bobwan1995/PMFNet.
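To make the pose-as-attention idea concrete, here is a minimal PyTorch sketch, with hypothetical layer names and sizes that are not the authors' PMFNet code: pose features predict per-part weights that re-weight part-level appearance features before fusion with holistic (union-region) and object branches.

```python
import torch
import torch.nn as nn

class PoseAwareAttentionSketch(nn.Module):
    """Illustrative only: pose features produce per-part attention
    weights; attended part cues are fused with holistic and object
    branches. All dimensions are assumptions, not the paper's."""
    def __init__(self, feat_dim=256, num_parts=17, num_classes=600):
        super().__init__()
        self.holistic = nn.Linear(feat_dim, feat_dim)   # union-region branch
        self.object = nn.Linear(feat_dim, feat_dim)     # human/object branch
        # pose branch predicts one attention weight per body part
        self.part_attn = nn.Sequential(
            nn.Linear(feat_dim, num_parts), nn.Sigmoid())
        self.classifier = nn.Linear(3 * feat_dim, num_classes)

    def forward(self, union_feat, obj_feat, part_feats, pose_feat):
        # part_feats: (B, num_parts, feat_dim); pose_feat: (B, feat_dim)
        attn = self.part_attn(pose_feat).unsqueeze(-1)  # (B, num_parts, 1)
        part = (attn * part_feats).mean(dim=1)          # attended part cue
        fused = torch.cat([self.holistic(union_feat),
                           self.object(obj_feat), part], dim=-1)
        return self.classifier(fused)                   # relation logits
```

The key design point is that the pose stream only gates the part features; appearance and context still flow through their own branches before fusion.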
The key challenge in few-shot semantic segmentation (FSS) is how to tailor a desirable interaction between support and query features and/or their prototypes under the episodic training scenario. Most existing FSS methods implement such support/query interactions solely through plain operations, e.g., cosine similarity and feature concatenation, for segmenting the query objects. However, these interaction approaches usually cannot capture the intrinsic object details widely encountered in FSS query images; for instance, if the query object to be segmented has holes and slots, inaccurate segmentation almost always results. To this end, we propose a dynamic prototype convolution network (DPCN) to fully capture these intrinsic details for accurate FSS. Specifically, in DPCN, a dynamic convolution module (DCM) first generates dynamic kernels from the support foreground; information interaction is then achieved by convolving the query features with these kernels. Moreover, we equip DPCN with a support activation module (SAM) and a feature filtering module (FFM), which respectively generate a pseudo mask for the query images and filter out background information. Together, SAM and FFM mine enriched context information from the query features. DPCN is also flexible and efficient under the k-shot FSS setting. Extensive experiments on PASCAL-5^i and COCO-20^i show that DPCN yields superior performance under both 1-shot and 5-shot settings.
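A minimal sketch of the dynamic-kernel interaction described above, assuming masked pooling of the support foreground into k x k kernels followed by a per-sample depthwise convolution over the query features; this is an illustration of the idea, not DCM's actual kernel-generation scheme.

```python
import torch
import torch.nn.functional as F

def dynamic_conv_sketch(support_feat, support_mask, query_feat, k=3):
    """support_feat, query_feat: (B, C, H, W); support_mask: (B, 1, H, W).
    Pools support foreground into one k x k kernel per channel, then
    convolves the query map with it (depthwise). Illustrative only."""
    B, C, H, W = support_feat.shape
    fg = support_feat * support_mask            # keep foreground responses
    kernels = F.adaptive_avg_pool2d(fg, k)      # (B, C, k, k) dynamic kernels
    out = []
    for b in range(B):                          # per-episode convolution
        w = kernels[b].unsqueeze(1)             # (C, 1, k, k) depthwise weight
        out.append(F.conv2d(query_feat[b:b+1], w,
                            padding=k // 2, groups=C))
    return torch.cat(out, dim=0)                # (B, C, H, W) interacted feats
```

Because the kernels are regenerated from each episode's support foreground, the interaction adapts to the novel class rather than relying on a fixed similarity operator.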
Background: L-phenylalanine (L-Phe) is an essential amino acid for mammals, with applications expanding into human health and nutrition products. In this study, systems-level engineering was conducted to enhance L-Phe biosynthesis in Escherichia coli.
Results: We inactivated the PTS system and recruited glucose uptake via combinatorial modulation of galP and glk to increase PEP supply in the Xllp01 strain. In addition, the HTH domain of the transcription factor TyrR was engineered to decrease its repression of the transcriptional levels of L-Phe pathway enzymes. Finally, proteomics analysis identified the third step of the shikimate (SHIK) pathway, catalyzed by AroD, as the rate-limiting step for L-Phe production. After optimization of the aroD promoter strength, the L-Phe titer increased by 13.3%. RT-PCR analysis of the transcriptional levels of genes involved in the central metabolic pathways and L-Phe biosynthesis showed that the recombinant L-Phe producer exhibited a strong capability for glucose utilization and precursor (PEP and E4P) generation. Through this systems-level engineering, the L-Phe titer of the Xllp21 strain reached 72.9 g/L in a 5 L fermenter under non-optimized fermentation conditions, 1.62 times that of the original strain Xllp01.
Conclusion: The metabolic engineering strategy reported here can be broadly employed for developing genetically defined organisms for the efficient production of other aromatic amino acids and derived compounds.
Electronic supplementary material: The online version of this article (10.1186/s12896-018-0418-1) contains supplementary material, which is available to authorized users.
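As a quick consistency check on the figures quoted above, the reported 1.62-fold improvement implies a baseline titer for the original Xllp01 strain of roughly 45 g/L:

```latex
% Back-calculating the Xllp01 baseline from the reported fold change
\[
  \text{titer}_{\mathrm{Xllp01}} \approx \frac{72.9\ \mathrm{g/L}}{1.62}
  \approx 45.0\ \mathrm{g/L}
\]
```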
Visual grounding is a ubiquitous building block in many vision-language tasks, yet it remains challenging due to large variations in the visual and linguistic features of grounding entities, strong context effects, and the resulting semantic ambiguities. Prior works typically focus on learning representations of individual phrases with limited context information. To address these limitations, this paper proposes a language-guided graph representation that captures the global context of grounding entities and their relations, and develops a cross-modal graph matching strategy for the multiple-phrase visual grounding task. In particular, we introduce a modular graph neural network that computes context-aware representations of phrases and object proposals via message propagation, followed by a graph-based matching module that generates globally consistent localization of the grounding phrases. We train the entire graph neural network jointly with a two-stage strategy and evaluate it on the Flickr30K Entities benchmark. Extensive experiments show that our method outperforms the prior state of the art by a sizable margin, evidencing the efficacy of our grounding framework. Code is available at https://github.com/youngfly11/LCMCG-PyTorch.
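For intuition, a generic one-step message-propagation update and a cosine-similarity cross-graph matching might look like the following PyTorch sketch; the shapes and modules here are assumptions, and the paper's modular GNN and matching module are more elaborate.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MessagePassingSketch(nn.Module):
    """One step of message propagation over a phrase (or proposal)
    graph with a dense (N, N) adjacency; a generic update, not the
    paper's language-guided variant."""
    def __init__(self, dim=256):
        super().__init__()
        self.msg = nn.Linear(dim, dim)   # transform neighbor features
        self.upd = nn.GRUCell(dim, dim)  # gated node update

    def forward(self, node_feats, adj):
        # node_feats: (N, dim); adj: (N, N) row-normalized edge weights
        messages = adj @ self.msg(node_feats)  # aggregate from neighbors
        return self.upd(messages, node_feats)  # context-aware node states

def match_scores(phrase_nodes, proposal_nodes):
    """Cosine similarity between phrase-graph and proposal-graph nodes;
    a globally consistent assignment would be decoded from this matrix."""
    p = F.normalize(phrase_nodes, dim=-1)
    o = F.normalize(proposal_nodes, dim=-1)
    return p @ o.t()                           # (N_phrase, N_proposal)
```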