Proceedings of the British Machine Vision Conference 2009
DOI: 10.5244/c.23.2

Learning Models for Object Recognition from Natural Language Descriptions

Abstract: We investigate the task of learning models for visual object recognition from natural language descriptions alone. The approach contributes to the recognition of fine-grain object categories, such as animal and plant species, where it may be difficult to collect many images for training, but where textual descriptions of visual attributes are readily available. As an example we tackle recognition of butterfly species, learning models from descriptions in an online nature guide. We propose natural language proc…

Cited by 182 publications (129 citation statements)
References 20 publications

“…Attribute-based representations have recently received much attention because they have been successfully used for image retrieval (Yu et al, 2012), for recognizing objects (Duan et al, 2012; Wang et al, 2009), for describing unknown objects (Farhadi et al, 2009), and even for learning new unseen object models from descriptions (Farhadi et al, 2009; Lampert et al, 2009). Facial attributes have a key role in human-computer interaction applications, image and video retrieval and surveillance.…”
Section: Introduction (mentioning)
confidence: 99%
“…We illustrate harvesting training images for ten butterfly categories of the Leeds Butterfly Dataset [15], using the provided eNature visual descriptions. Figure 2 shows the pipeline for our method, starting from the butterfly species' name and visual description.…”
Section: Overview (mentioning)
confidence: 99%
“…The seed phrases are restricted to noun phrases and adjective phrases, obtained via phrase chunking as in [15]. The number of seed phrases per category ranges from 5 to 17 depending on the length of the description; an example list is shown in Fig.…”
Section: Search Engine Query (mentioning)
confidence: 99%
“…This is a cost-effective alternative to hand-listing attributes [10,15] and to architectures which require a human-in-the-loop [25]. Existing solutions [1,34,35] were typically developed for visual object recognition tasks. [34] proposes to mine pre-existing natural language resources.…”
Section: Introduction (mentioning)
confidence: 99%
“…Existing solutions [1,34,35] were typically developed for visual object recognition tasks. [34] proposes to mine pre-existing natural language resources. [1] uses mutual information to learn attributes relevant for e-commerce categories (handbags, shoes, earrings and ties). [8] uses latent CRF to discover detectable and discriminative attributes.…”
Section: Introduction (mentioning)
confidence: 99%