The use of computational methods to evaluate aesthetics in photography has gained interest in recent years due to the popularization of convolutional neural networks and the availability of new annotated datasets. Most studies in this area have focused on designing models that do not take individual preferences into account when predicting the aesthetic value of pictures. We propose a model based on residual learning that is capable of learning subjective, user-specific preferences over aesthetics in photography, while surpassing state-of-the-art methods and keeping a limited number of user-specific parameters in the model. Our model can also be used for picture enhancement, and it is suitable for content-based or hybrid recommender systems in which the amount of computational resources is limited.

The problem of taking subjective preferences into account in image aesthetics prediction is referred to as personalized image aesthetics [27]. Most recent approaches to image aesthetics evaluation have used deep-learning models, which require a significant amount of annotated data for their training and evaluation. In real-world situations, it is unrealistic to assume that thousands of rated images will be available for any given user, which limits the use of deep-learning models for personalized image aesthetics prediction.

In order to train a machine learning model capable of taking individual preferences over aesthetics in photography into account, an annotated dataset that records the identity of the rater of each picture is needed. One example of this kind of dataset is the FLICKR-AES dataset, presented by Ren et al. [27], which contains over 40,000 images rated by more than 200 different human raters. Their study provides, along with this dataset (and another, smaller, dataset), a residual-based learning model capable of taking user-specific preferences over aesthetics in photography into account.

We build on their work and propose an end-to-end convolutional neural network model capable of modelling user-specific preferences at different levels of abstraction, while keeping a reduced number of user-specific parameters within the model. Our method models user-specific preferences by using residual adapters, which were presented in [26,25] and have shown success in multi-domain learning (see the adapter sketch at the end of this section). The main difference between our model and Ren et al.'s is that they model user-specific preferences by first training a generic aesthetics network that predicts a mean aesthetic score, and then computing a user-specific offset with a Support Vector Regressor that takes the predicted content and a set of manually defined attributes of the picture as its input; our model instead embeds the user-specific parameters in different layers of the neural network, allowing it to find user-specific features at different levels of abstraction that do not necessarily depend on the contents and a fixed set of attributes of the pictures.

Our main contributions are as follows: First, we propo...
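To make the residual-adapter idea concrete, the following is a minimal sketch of how a user-specific adapter can be attached to a shared convolutional block. It is an illustration under stated assumptions, not the paper's exact architecture: the class names `ResidualAdapter` and `AdaptedBlock`, the `num_users`/`user_id` handling, and the placement of batch normalization are hypothetical choices made here for clarity; they follow the general adapter parametrization of [26,25], where a small 1x1 convolution is added residually to the shared features.

```python
import torch
import torch.nn as nn


class ResidualAdapter(nn.Module):
    """Sketch of a residual adapter in the style of [26,25]: a small
    1x1 convolution whose output is added to the shared feature map,
    so only these few weights need to be stored per user."""

    def __init__(self, channels: int):
        super().__init__()
        # The 1x1 convolution holds the (small) set of user-specific parameters.
        self.adapter = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual form: shared features plus a user-specific correction.
        return self.bn(x + self.adapter(x))


class AdaptedBlock(nn.Module):
    """Hypothetical wrapper: a shared convolutional block followed by a
    per-user adapter, selected by a user index at inference time."""

    def __init__(self, in_ch: int, out_ch: int, num_users: int):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.adapters = nn.ModuleList(
            ResidualAdapter(out_ch) for _ in range(num_users)
        )

    def forward(self, x: torch.Tensor, user_id: int) -> torch.Tensor:
        # Shared features are computed once; only the adapter is user-specific.
        return self.adapters[user_id](self.shared(x))
```

Because the shared convolutions are common to all users, the per-user cost of such a block is limited to the 1x1 adapter and its normalization parameters, which is what keeps the number of user-specific parameters small in this kind of design.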