Word embeddings (e.g., word2vec) have been successfully applied to eCommerce products through prod2vec. Inspired by the recent performance improvements that contextualized embeddings have brought to several NLP tasks, we propose to transfer BERT-like architectures to eCommerce: our model, Prod2BERT, is trained to generate product representations through masked session modeling. Through extensive experiments over multiple shops, different tasks, and a range of design choices, we systematically compare the accuracy of Prod2BERT and prod2vec embeddings: while Prod2BERT is found to be superior in several scenarios, we highlight the importance of resources and hyperparameters in the best performing models. Finally, we provide practitioners with guidelines for training embeddings under a variety of computational and data constraints.

* Federico and Bingqing contributed equally to this research.
† Corresponding author.

10 Costs are from official AWS pricing: 0.10 USD/h for the c4.large (https://aws.amazon.com/it/ec2/pricing/on-demand/) and 12.24 USD/h for the p3.8xlarge (https://aws.amazon.com/it/ec2/instance-types/p3/). While cost optimizations are obviously possible, the "naive" pricing is a good proxy to appreciate the difference between the two methods.
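Since the abstract only names the training objective, a minimal sketch may help clarify what masked session modeling looks like in practice: product IDs in a browsing session play the role of word tokens, a fraction of them is replaced by a mask token, and a BERT-like model is trained to recover the hidden products. All names, sizes, and special-token ids below are illustrative assumptions, not the paper's actual Prod2BERT configuration.

    # Minimal sketch of masked session modeling (illustrative only).
    import random
    import torch
    from transformers import BertConfig, BertForMaskedLM

    VOCAB_SIZE = 10_000   # hypothetical: product catalogue size + special tokens
    MASK_ID = 1           # hypothetical id reserved for the [MASK] token

    config = BertConfig(vocab_size=VOCAB_SIZE, hidden_size=128,
                        num_hidden_layers=4, num_attention_heads=4)
    model = BertForMaskedLM(config)

    def mask_session(session, mask_prob=0.15):
        """Mask random product tokens in a session (simplified BERT masking)."""
        input_ids = list(session)
        labels = [-100] * len(session)     # -100 positions are ignored by the loss
        positions = [i for i in range(len(session)) if random.random() < mask_prob]
        if not positions:                  # always mask at least one token
            positions = [random.randrange(len(session))]
        for i in positions:
            labels[i] = session[i]         # the model must recover this product
            input_ids[i] = MASK_ID
        return torch.tensor([input_ids]), torch.tensor([labels])

    # One toy browsing session of integer product ids (hypothetical data).
    inputs, labels = mask_session([42, 7, 1337, 256, 99])
    loss = model(input_ids=inputs, labels=labels).loss
    loss.backward()   # a single masked-session-modeling training step

For comparison, the prod2vec baseline applies a word2vec implementation (e.g., gensim's Word2Vec) to the same data, treating each session as a sentence and each product ID as a word.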
Ethical Considerations

User data has been collected by Coveo in the process of providing business services: data is collected and processed in an anonymized fashion, in compliance with existing legislation. In particular, the target dataset uses only anonymous UUIDs to label events and, as such, does not contain any information that can be linked to physical entities.