Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers (2017)
DOI: 10.18653/v1/e17-2105

Large-Scale Categorization of Japanese Product Titles Using Neural Attention Models

Abstract: We propose a variant of Convolutional Neural Network (CNN) models, the Attention CNN (ACNN), for large-scale categorization of millions of Japanese items into thirty-five product categories. Compared to a state-of-the-art Gradient Boosted Tree (GBT) classifier, the proposed model reduces training time from three weeks to three days while maintaining more than 96% accuracy. Additionally, our proposed model characterizes products by imputing attentive focus on word tokens in a language agnostic way. The attentio…
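To make the idea concrete, here is a minimal PyTorch sketch of an attention-augmented CNN text classifier in the spirit of the ACNN: parallel convolutions capture n-gram features while a token-level attention branch produces per-word weights. All names, dimensions, and the exact way the two branches are combined are illustrative assumptions, not the paper's actual architecture.

```python
# A minimal sketch of an attention-augmented CNN text classifier; the layer
# sizes and the fusion of the two branches are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionCNN(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, num_filters=100,
                 kernel_sizes=(2, 3, 4), num_classes=35):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Parallel convolutions over token embeddings, one per n-gram width.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k, padding=k // 2)
             for k in kernel_sizes]
        )
        # Scores each token position; softmax turns scores into attention.
        self.attn = nn.Linear(embed_dim, 1)
        self.fc = nn.Linear(num_filters * len(kernel_sizes) + embed_dim,
                            num_classes)

    def forward(self, token_ids):                       # (batch, seq_len)
        emb = self.embedding(token_ids)                 # (batch, seq_len, dim)
        # CNN branch: max-pool each filter map over time.
        conv_in = emb.transpose(1, 2)                   # (batch, dim, seq_len)
        pooled = [F.relu(conv(conv_in)).max(dim=2).values
                  for conv in self.convs]
        # Attention branch: weighted average of token embeddings.
        weights = F.softmax(self.attn(emb).squeeze(-1), dim=1)
        attended = (weights.unsqueeze(-1) * emb).sum(dim=1)
        features = torch.cat(pooled + [attended], dim=1)
        return self.fc(features), weights   # logits and per-token attention

model = AttentionCNN(vocab_size=30000)
logits, attn = model(torch.randint(1, 30000, (4, 20)))  # batch of 4 titles
print(logits.shape, attn.shape)  # torch.Size([4, 35]) torch.Size([4, 20])
```

The returned weights are what makes such a model interpretable: they indicate which title tokens drove the predicted category, which is the "attentive focus on word tokens" the abstract refers to.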

Cited by 17 publications (17 citation statements); references 16 publications (16 reference statements). Selected citation statements follow, ordered by relevance.
“…Each MT system in Section 4.2 tokenizes a product title into individual words and then outputs a root-to-leaf path one node category at a time, similar to how a translated sentence is generated one word at a time. The distribution of products across categories in both datasets is skewed toward the most popular categories, as is usually the case in e-commerce domains [He and McAuley 2016; Xia et al. 2017]. Figures 5 and 6 show the number of products in each category at the top level of the taxonomy tree (each vertical bar reflects the number of products in that category).…”
Section: Experiments, 4.1 Datasets
confidence: 97%
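The decoding idea in this excerpt can be sketched in a few lines: a model emits a root-to-leaf taxonomy path one node at a time, the way an MT decoder emits one word at a time. The toy taxonomy, scoring function, and greedy search below are assumptions for illustration, not the cited systems' implementations.

```python
# Greedy root-to-leaf decoding over a toy taxonomy; the scorer stands in for
# a trained model's next-node probabilities (an illustrative assumption).
from typing import Callable

TAXONOMY = {                        # parent -> children (toy example)
    "root": ["Electronics", "Fashion"],
    "Electronics": ["Cameras", "Audio"],
    "Fashion": ["Shoes", "Bags"],
}

def decode_path(title: str,
                score: Callable[[str, str, list], float]) -> list:
    """Greedily extend the path with the best-scoring child until a leaf."""
    path = ["root"]
    while path[-1] in TAXONOMY:                        # stop at a leaf node
        children = TAXONOMY[path[-1]]
        path.append(max(children, key=lambda c: score(title, c, path)))
    return path[1:]                                    # drop the dummy root

# Toy scorer: rewards a child category whose initial letter occurs in the title.
def toy_score(title, child, path):
    return float(child.lower()[0] in title.lower())

print(decode_path("canon camera 24mm", toy_score))  # ['Electronics', 'Cameras']
```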
“…Kozareva [2015] uses a variety of features (e.g., n-grams, latent Dirichlet allocation topics [Blei et al. 2003], and word2vec embeddings [Mikolov et al. 2013]) in a multi-class algorithm. Both Ha et al. [2016] and Xia et al. [2017] use deep learning to learn a compact vector representation of a product's attributes (e.g., product title, merchant ID, and product image) and use that representation to classify the product. They differ in the kind of deep learning model used.…”
Section: Related Work
confidence: 99%
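A minimal scikit-learn sketch of the feature mix attributed to Kozareva [2015]: word n-grams plus LDA topic proportions feeding a multi-class classifier. The feature sizes and the logistic-regression classifier are illustrative assumptions (word2vec features are omitted for brevity).

```python
# n-gram counts plus LDA topic proportions as concatenated features for a
# multi-class classifier; toy data, sizes, and classifier are assumptions.
from scipy.sparse import hstack
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

titles = ["canon eos camera body", "leather tote bag", "wireless earbuds"]
labels = ["Electronics", "Fashion", "Electronics"]

# n-gram counts (unigrams + bigrams) over product titles.
vectorizer = CountVectorizer(ngram_range=(1, 2))
ngrams = vectorizer.fit_transform(titles)

# Topic proportions from LDA fit on the same counts.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topics = lda.fit_transform(ngrams)

features = hstack([ngrams, topics])        # concatenate both feature groups
clf = LogisticRegression(max_iter=1000).fit(features, labels)

new = vectorizer.transform(["usb camera cable"])
print(clf.predict(hstack([new, lda.transform(new)])))
```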
“…Xia, Y. et al. [4] classified product categories using product title data in Japanese, an agglutinative language. An attention convolutional neural network (ACNN) was proposed as a variant of a conventional CNN model and compared against a gradient boosted tree (GBT) classifier [18].…”
Section: Related Work
confidence: 99%
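For contrast with the ACNN sketch above, here is a minimal sketch of a GBT baseline of the kind the abstract compares against: gradient boosted trees over character n-gram title features. The pipeline and hyperparameters are assumptions for illustration, not the paper's actual GBT configuration.

```python
# A toy GBT text-classification baseline; features and settings are assumed.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

titles = ["nikon zoom lens", "silk scarf", "bluetooth speaker", "denim jacket"]
labels = ["Electronics", "Fashion", "Electronics", "Fashion"]

pipeline = make_pipeline(
    # Character n-grams sidestep word segmentation, which matters for
    # Japanese titles that lack whitespace between words.
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 3), max_features=5000),
    GradientBoostingClassifier(n_estimators=100, random_state=0),
)
pipeline.fit(titles, labels)
print(pipeline.predict(["leather handbag"]))
```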
“…This approach sets the weights of particular layers appropriately and applies them in typical models such as convolutional neural networks (CNNs) [1,2]. The second approach is to develop a technically new model [3,4].…”
Section: Introduction
confidence: 99%
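The first approach in this excerpt, reusing pretrained weights and fixing particular layers, can be sketched as follows in PyTorch; the torchvision model and the choice of which layers to freeze are illustrative assumptions.

```python
# Freeze pretrained layers and fine-tune only a new classification head.
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(weights="IMAGENET1K_V1")   # start from pretrained weights

# Freeze every layer, then replace the head so only it trains from scratch.
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 35)   # new head, e.g. 35 classes

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)   # only the new head's weight and bias remain trainable
```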
“…Deep learning-based methods have been widely used for the IC task. These include deep neural network models arranged in a hierarchical classifier structure, which improved performance over conventional machine learning models (Cevahir and Murakami, 2016), and an attention mechanism that identifies words semantically correlated with the predicted categories, yielding better feature representations and higher classification performance (Xia et al., 2017).…”
Section: Related Work
confidence: 99%
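A minimal sketch of the hierarchical classifier structure attributed to Cevahir and Murakami (2016): one model picks the top-level category, then a per-branch model refines it to a leaf. The models, features, and two-level split are illustrative assumptions, not the cited architecture.

```python
# Two-level hierarchical classification with per-branch models; toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def make_clf():
    return make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))

# Toy training data: (title, top-level category, leaf category).
data = [
    ("canon eos camera", "Electronics", "Cameras"),
    ("sony headphones", "Electronics", "Audio"),
    ("leather tote bag", "Fashion", "Bags"),
    ("running shoes", "Fashion", "Shoes"),
]
titles = [t for t, _, _ in data]

top_clf = make_clf().fit(titles, [top for _, top, _ in data])
leaf_clfs = {
    top: make_clf().fit([t for t, tp, _ in data if tp == top],
                        [leaf for _, tp, leaf in data if tp == top])
    for top in {"Electronics", "Fashion"}
}

def predict(title):
    """Route through the top-level model, then the matching branch model."""
    top = top_clf.predict([title])[0]
    return top, leaf_clfs[top].predict([title])[0]

print(predict("mirrorless camera body"))
```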