Rapid Clothing Retrieval via Deep Learning of Binary Codes and Hierarchical Search

Lin, Kevin; Yang, Huei-Fang; Liu, Kuan-Hsien; Hsiao, Jen-Hao; Chen, Chu-Song

doi:10.1145/2671188.2749318

Cited by 66 publications

(34 citation statements)

References 14 publications

Supporting

Mentioning

Contrasting

Unclassified

Order By: Relevance

“…Whereas visual similarity asks "what looks like this? ", and is fairly well understood [11,32,42,30], compatibility instead asks "what complements this?" It requires capturing how multiple visual items interact, often according to subtle visual properties.…”

Section: Outfit #1mentioning

confidence: 99%

See 1 more Smart Citation

Creating Capsule Wardrobes from Fashion Images

Hsiao

Grauman

2018

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition

119

View full text Add to dashboard Cite

We propose to automatically create capsule wardrobes. Given an inventory of candidate garments and accessories, the algorithm must assemble a minimal set of items that provides maximal mix-and-match outfits. We pose the task as a subset selection problem. To permit efficient subset selection over the space of all outfit combinations, we develop submodular objective functions capturing the key ingredients of visual compatibility, versatility, and user-specific preference. Since adding garments to a capsule only expands its possible outfits, we devise an iterative approach to allow near-optimal submodular function maximization. Finally, we present an unsupervised approach to learn visual compatibility from "in the wild" full body outfit photos; the compatibility metric translates well to cleaner catalog photos and improves over existing methods. Our results on thousands of pieces from popular fashion websites show that automatic capsule creation has potential to mimic skilled fashionistas in assembling flexible wardrobes, while being significantly more scalable.

show abstract

Section: Outfit #1mentioning

confidence: 99%

“…Compatibility and recommendation Substantial prior work explores ways to link images containing the same or very similar garment [11,32,42,21,30]. In contrast, compatibility requires judging how well-coordinated or complementary a given set of garments is.…”

Section: Related Workmentioning

confidence: 99%

Creating Capsule Wardrobes from Fashion Images

Hsiao

Grauman

2018

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition

119

View full text Add to dashboard Cite

show abstract

“…Text based latent Dirichlet allocation approaches have been used to represent and search products on e-commerce sites [11], [12]. Image based CNN features have been used in clothing retrieval [13]. The use of multimodal representations in an e-commerce context, which we advocate in this paper, is novel.…”

Section: Task 2 (Txt2img)mentioning

confidence: 99%

Fashion Meets Computer Vision and NLP at e-Commerce Search

Zoghbi¹,

Heyman²,

Gómez³

et al. 2016

IJCEE

View full text Add to dashboard Cite

Abstract:In this paper, we focus on cross-modal (visual and textual) e-commerce search within the fashion domain. Particularly, we investigate two tasks: 1) given a query image, we retrieve textual descriptions that correspond to the visual attributes in the query; and 2) given a textual query that may express an interest in specific visual product characteristics, we retrieve relevant images that exhibit the required visual attributes. To this end, we introduce a new dataset that consists of 53,689 images coupled with textual descriptions. The images contain fashion garments that display a great variety of visual attributes, such as different shapes, colors and textures in natural language. Unlike previous datasets, the text provides a rough and noisy description of the item in the image. We extensively analyze this dataset in the context of cross-modal e-commerce search. We investigate two state-of-the-art latent variable models to bridge between textual and visual data: bilingual latent Dirichlet allocation and canonical correlation analysis. We use state-of-the-art visual and textual features and report promising results.

show abstract

“…In [34], the style retrieval uses Siamese CNN to transform features into a latent space. To achieve fast retrieval in a large scale dataset, [23] devised hashes-like representations learned by a latent layer added to the network during fine-tuning on the clothing dataset. Another emerging topic is fashionability: predict how fashionable a person looks.…”

Section: Related Workmentioning

confidence: 99%

Fully Convolutional Network with Superpixel Parsing for Fashion Web Image Segmentation

Yang¹,

Rodriguez²,

Crucianu³

et al. 2016

MultiMedia Modeling

View full text Add to dashboard Cite

In this paper we introduce a new method for extracting deformable clothing items from still images by extending the output of a Fully Convolutional Neural Network (FCN) to infer context from local units (superpixels). To achieve this we optimize an energy function, that combines the large scale structure of the image with the local low-level visual descriptions of superpixels, over the space of all possible pixel labelings. To assess our method we compare it to the unmodified FCN network used as a baseline, as well as to the well-known Paper Doll and Co-parsing methods for fashion images.

show abstract

Rapid Clothing Retrieval via Deep Learning of Binary Codes and Hierarchical Search

Cited by 66 publications

References 14 publications

Creating Capsule Wardrobes from Fashion Images

Creating Capsule Wardrobes from Fashion Images

Fashion Meets Computer Vision and NLP at e-Commerce Search

Fully Convolutional Network with Superpixel Parsing for Fashion Web Image Segmentation

Contact Info

Product

Resources

About