Clothing image classification is increasingly important to the development of online clothing shopping. Clothing category tagging, clothing product retrieval, and similar-clothing recommendation are popular applications in current online clothing shopping, and all of them depend on accurate clothing image classification. The wide variety of clothing types and styles makes accurate classification difficult. Traditional neural networks cannot capture the spatial structure information of clothing images, which leads to poor classification accuracy. To achieve high accuracy, an enhanced capsule (EnCaps) network is proposed that combines image features with spatial structure features. First, a spatial structure extraction model based on the EnCaps network is proposed to obtain clothing structure features. Second, an enhanced feature extraction model is proposed that uses a deeper network structure and an attention mechanism to extract more robust clothing features. Third, parameter optimization based on the inception mechanism is used to reduce the computation required by the proposed network. Experimental results indicate that the proposed EnCaps network achieves high performance in terms of classification accuracy and computational efficiency.
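The abstract does not give the EnCaps layer definitions. As a rough illustration of how a capsule layer keeps a vector (direction plus length) per feature rather than a single scalar, the sketch below implements the squashing nonlinearity commonly used in capsule networks. This is an assumption for illustration only: the function name `squash`, the `eps` parameter, and the choice of this particular nonlinearity are not taken from the paper and may differ from what EnCaps actually uses.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Capsule squashing nonlinearity (illustrative, not the EnCaps definition).

    Shrinks each capsule vector to a length in [0, 1) while keeping its
    direction, so length can act as a presence score and direction can
    carry pose or structure information.
    """
    s = np.asarray(s, dtype=float)
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)          # approaches 1 for long vectors
    return scale * s / np.sqrt(sq_norm + eps)  # eps avoids division by zero

# Two hypothetical capsule vectors: one strong, one empty.
caps = np.array([[3.0, 4.0], [0.0, 0.0]])
out = squash(caps)
lengths = np.linalg.norm(out, axis=-1)  # first is about 0.96, second is 0
```

The output vectors point in the same direction as the inputs; only their lengths are compressed, which is what lets later layers compare spatial arrangements of parts.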
Image-based virtual try-on systems have significant commercial value in online garment shopping. However, prior methods do not handle details appropriately and therefore fail to preserve the original appearance of organizational items, including the arms, the neck, and in-shop garments. We propose a novel high-fidelity virtual try-on network to generate realistic results. Specifically, a distributed pipeline is used to generate the organizational items simultaneously. First, the in-shop garment is warped using thin plate splines (TPS) to give a coarse shape reference, and a corresponding target semantic map is then generated, which adapts to the different distributions of items produced by different garments. Second, the organizational items are generated as separate components by our novel semantic map-based image adjustment network (SMIAN) to avoid interference between body parts. Finally, SMIAN integrates all components to generate the overall result. Prior dual-modal information is incorporated in the tail layers of SMIAN to improve the convergence rate of the network. Experiments demonstrate that the proposed method retains the details of the condition information better than current methods. Our method achieves convincing quantitative and qualitative results on existing benchmark datasets.
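The abstract names thin plate splines (TPS) for the coarse garment warp but gives no implementation details. The following is a minimal sketch of a standard 2-D TPS fitted from control-point pairs, assuming the textbook kernel U(r) = r^2 log r and no regularization. The function names `fit_tps` and `apply_tps` and the example control points are hypothetical; in the paper the control points would presumably come from a learned module, which is not modeled here.

```python
import numpy as np

def _tps_kernel(r):
    # U(r) = r^2 * log(r), with U(0) defined as 0.
    out = np.zeros_like(r)
    mask = r > 0
    out[mask] = r[mask] ** 2 * np.log(r[mask])
    return out

def fit_tps(src, dst):
    """Fit a 2-D thin plate spline that maps src control points onto dst."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]
    # Pairwise distances between control points -> radial basis matrix K.
    K = _tps_kernel(np.linalg.norm(src[:, None, :] - src[None, :, :], axis=-1))
    P = np.hstack([np.ones((n, 1)), src])  # affine part: [1, x, y]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    params = np.linalg.solve(A, b)
    return src, params[:n], params[n:]  # control points, weights, affine

def apply_tps(model, pts):
    """Warp arbitrary 2-D points with a fitted TPS model."""
    src, w, a = model
    pts = np.asarray(pts, dtype=float)
    U = _tps_kernel(np.linalg.norm(pts[:, None, :] - src[None, :, :], axis=-1))
    return U @ w + np.hstack([np.ones((len(pts), 1)), pts]) @ a

# Hypothetical control points: unit-square corners plus a displaced centre.
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]])
dst = src.copy()
dst[4] = [0.6, 0.55]
model = fit_tps(src, dst)
warped = apply_tps(model, src)  # reproduces dst at the control points
```

To warp an image rather than points, one would typically apply the same mapping to a grid of pixel coordinates and resample the garment image at the warped locations.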