Methotrexate/iatrogenic lymphoproliferative disorders in rheumatoid arthritis: histology, Epstein–Barr virus, and clonality are important predictors of disease progression and regression

Many modern approaches for object detection are two-stage pipelines. The first stage identifies regions of interest, which are then classified in the second stage. Faster R-CNN is one such approach, combining both stages into a single pipeline. In this paper, we apply Faster R-CNN to the task of company logo detection. Motivated by its weak performance on small object instances, we examine in detail both the proposal and the classification stage across a wide range of object sizes, and we investigate the influence of feature map resolution on the performance of those stages. Based on theoretical considerations, we introduce an improved scheme for generating anchor proposals and propose a modification to Faster R-CNN that leverages higher-resolution feature maps for small objects. We evaluate our approach on the FlickrLogos dataset, improving the RPN performance from 0.52 to 0.71 (MABO) and the detection performance from 0.52 to 0.67 (mAP).
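The MABO figure quoted above is the mean average best overlap, a common measure of proposal quality: for every ground-truth box, take the highest intersection over union (IoU) reached by any proposal in the same image, average those values per class, then average over classes. The following is a minimal sketch under stated assumptions, not the authors' evaluation code. The function names `iou` and `mabo`, the `(x1, y1, x2, y2)` box layout, and the record format are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mabo(records):
    """Mean average best overlap.

    records: iterable of (class_label, gt_box, proposals), where proposals
    is the list of proposal boxes for the image containing gt_box
    (an assumed layout, chosen for this sketch).
    """
    best_per_class = defaultdict(list)
    for label, gt_box, proposals in records:
        # Best overlap: the highest IoU any proposal reaches for this box.
        best = max((iou(gt_box, p) for p in proposals), default=0.0)
        best_per_class[label].append(best)
    if not best_per_class:
        return 0.0
    # Average best overlap per class, then the mean over classes.
    abo = [sum(v) / len(v) for v in best_per_class.values()]
    return sum(abo) / len(abo)


records = [
    ("logo_a", (0, 0, 10, 10), [(0, 0, 10, 10), (20, 20, 30, 30)]),
    ("logo_b", (0, 0, 10, 10), [(0, 0, 5, 10)]),
]
print(mabo(records))
```

Averaging per class before averaging over classes keeps frequent classes from dominating the score, which matters when some logos appear far more often than others.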

High-Resolution Dual-Stage Multi-Level Feature Aggregation for Single Image and Video Deblurring

Brehm

Scherer

Lienhart

2020

A Convolutional Sequence to Sequence Model for Multimodal Dynamics Prediction in Ski Jumps

Zecha

Eggert

Einfalt

et al. 2018

Multimodal Image Captioning for Marketing Analysis

Harzig

Brehm

Lienhart

et al. 2018

Automatically captioning images with natural language sentences is an important research topic. State-of-the-art models are able to produce human-like sentences. These models typically describe the depicted scene as a whole and do not target specific objects of interest or emotional relationships between these objects in the image. However, marketing companies need descriptions of exactly these attributes of a given scene. In our case, objects of interest are consumer goods, which are usually identifiable by a product logo and are associated with certain brands. From a marketing point of view, it is desirable to also evaluate the emotional context of a trademarked product, i.e., whether it appears with a positive or a negative connotation. We address the problem of finding brands in images and deriving corresponding captions by introducing a modified image captioning network. We also add a third output modality, which simultaneously produces real-valued image ratings. Our network is trained using a classification-aware loss function in order to stimulate the generation of sentences with an emphasis on words identifying the brand of a product. We evaluate our model on a dataset of images depicting interactions between humans and branded products. The introduced network improves mean class accuracy by 24.5 percent. Thanks to the third output modality, it also considerably improves the quality of generated captions for images depicting branded products.

Stephan Brehm

A closer look: Small object detection in faster R-CNN

Improving Small Object Proposals for Company Logo Detection
