We posit that visually descriptive language offers computer vision researchers both information about the world and information about how people describe the world. This potential benefit is amplified by the enormous amount of language data readily available today. We present a system that automatically generates natural language descriptions of images by exploiting both statistics gleaned from parsing large quantities of text data and recognition algorithms from computer vision. The system is very effective at producing relevant sentences for images, and it generates descriptions that are notably more true to the specific image content than previous work.
Abstract: We present a system to automatically generate natural language descriptions from images. The system consists of two parts. The first part, content planning, smooths the output of computer-vision-based detection and recognition algorithms with statistics mined from large pools of visually descriptive text to determine the best content words to use to describe an image. The second part, surface realization, chooses words to construct natural language sentences based on the predicted content and general statistics from natural language. We present multiple approaches for the surface realization step and evaluate each using automatic measures of similarity to human-generated reference descriptions. We also collect forced-choice human evaluations comparing descriptions from the proposed generation system against descriptions from competing approaches. The proposed system is very effective at producing relevant sentences for images. It also generates descriptions that are notably more true to the specific image content than previous work.
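To make the two-stage division of labor concrete, the following minimal Python sketch separates content planning (combining detector confidences with text-mined statistics to pick content words) from surface realization (ordering those words into a sentence). Everything in it is a hypothetical stand-in: the detector scores, the attribute co-occurrence priors, the smoothing weight alpha, and the function names plan_content and realize_surface are invented for illustration and are not the authors' actual model or interface.

# Hypothetical sketch of a two-stage description pipeline: content planning
# followed by surface realization. All numbers and vocabularies are toy
# stand-ins for real detector outputs and corpus statistics.

def plan_content(detector_scores, cooccurrence_prior, alpha=0.7):
    """Content planning: smooth vision detector scores with text-mined
    attribute statistics and keep the highest-scoring (attribute, object) pairs."""
    planned = []
    for obj, det_score in detector_scores.items():
        # Pick the attribute most associated with this object in descriptive text.
        attr, prior = max(cooccurrence_prior.get(obj, {"": 0.0}).items(),
                          key=lambda kv: kv[1])
        score = alpha * det_score + (1 - alpha) * prior
        planned.append((score, attr, obj))
    planned.sort(reverse=True)
    return [(attr, obj) for _, attr, obj in planned[:2]]


def realize_surface(content):
    """Surface realization: turn the planned content words into a sentence
    using a simple template (one of many possible realizers)."""
    phrases = [f"a {attr} {obj}" if attr else f"a {obj}"
               for attr, obj in content]
    if len(phrases) == 1:
        return f"There is {phrases[0]}."
    return f"There is {phrases[0]} next to {phrases[1]}."


if __name__ == "__main__":
    # Toy detector confidences and attribute co-occurrence statistics
    # mined from descriptive text (both invented for illustration).
    detector_scores = {"dog": 0.9, "sofa": 0.6, "car": 0.2}
    cooccurrence_prior = {"dog": {"brown": 0.8, "furry": 0.5},
                          "sofa": {"red": 0.6},
                          "car": {"parked": 0.4}}
    content = plan_content(detector_scores, cooccurrence_prior)
    print(realize_surface(content))  # "There is a brown dog next to a red sofa."

In this toy run, the smoothed scores select "brown dog" and "red sofa", and a fixed template orders them into a single sentence; the surface realizers described in the abstract instead draw on general statistics from natural language and are evaluated in several variants.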