2022
DOI: 10.48550/arxiv.2201.02327
Preprint
On the Effectiveness of Sampled Softmax Loss for Item Recommendation

Abstract: Learning objectives of recommender models remain largely unexplored. Most methods routinely adopt either pointwise (e.g., binary cross-entropy) or pairwise (e.g., BPR) loss to train the model parameters, while rarely paying attention to softmax loss due to its high computational cost. Sampled softmax loss emerges as an efficient substitute for softmax loss. Its special case, InfoNCE loss, has been widely used in self-supervised learning and has exhibited remarkable performance for contrastive learning. Nonetheless, l…

Cited by 7 publications (17 citation statements)
References 20 publications
“…In prior works [2-4, 50], Q is often a generic sampling distribution such as log-uniform or uniform sampling, which has been shown to perform relatively well in general RS learning [44]. Specifically, using log-uniform sampling on the item set sorted by popularity gives the popular items a higher probability of being selected as negative samples.…”
Section: Sampled Softmax Loss
confidence: 99%
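To make the quoted setup concrete, below is a minimal PyTorch sketch of sampled softmax with a log-uniform proposal Q over a popularity-sorted item set. It is not from the cited papers: the function names and shapes are illustrative, and the logQ importance-sampling correction is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def log_uniform_sample(num_items: int, num_samples: int) -> torch.Tensor:
    """Draw negative item ids from a log-uniform (Zipfian) distribution.

    Assumes item ids are sorted by descending popularity, so small ids
    (popular items) are sampled more often, as the quote describes.
    """
    # Inverse-CDF sampling for P(k) = (log(k+2) - log(k+1)) / log(num_items+1)
    u = torch.rand(num_samples)
    ids = (torch.exp(u * torch.log(torch.tensor(num_items + 1.0))) - 1.0).long()
    return ids.clamp_(0, num_items - 1)

def sampled_softmax_loss(user_emb, item_emb, pos_ids, num_neg=100):
    """Score each positive against num_neg sampled negatives instead of
    the full catalogue (the logQ correction term is omitted here)."""
    neg_ids = log_uniform_sample(item_emb.size(0), num_neg)
    pos_logits = (user_emb * item_emb[pos_ids]).sum(-1, keepdim=True)  # (B, 1)
    neg_logits = user_emb @ item_emb[neg_ids].T                        # (B, num_neg)
    logits = torch.cat([pos_logits, neg_logits], dim=1)
    # The positive always sits in column 0.
    return F.cross_entropy(logits, torch.zeros(logits.size(0), dtype=torch.long))
```

Swapping `log_uniform_sample` for `torch.randint` yields the uniform variant the quote also mentions.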
“…First, we raise doubts about the effectiveness of uniformly sampled softmax in multi-interest scenarios. While uniformly sampled softmax has been shown to be effective for training general recommendation systems [44], it falls short in multi-interest recommendation systems such as ComiRec [2] and PIMIRec [4]. As illustrated in Figure 1 (a), it performs significantly worse than full softmax within a reasonable sample-size range (e.g., below a thousand).…”
Section: Introduction
confidence: 99%
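The sample-size gap the quote points to can be illustrated with a toy, hypothetical sanity check (random normalized embeddings and made-up sizes, not the cited experiment): the uniformly sampled objective normalizes over only 1+n items and so underestimates the full-softmax loss until n grows large.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_items, dim, batch = 50_000, 64, 256                  # made-up sizes
items = F.normalize(torch.randn(num_items, dim), dim=-1)
users = F.normalize(torch.randn(batch, dim), dim=-1)
pos = torch.randint(num_items, (batch,))

# Full softmax over the entire catalogue (the expensive reference point).
full = F.cross_entropy(users @ items.T, pos).item()

# Uniformly sampled softmax at increasing sample sizes.
for n in (100, 1_000, 10_000):
    neg = torch.randint(num_items, (n,))
    logits = torch.cat([(users * items[pos]).sum(-1, keepdim=True),
                        users @ items[neg].T], dim=1)
    sampled = F.cross_entropy(
        logits, torch.zeros(batch, dtype=torch.long)).item()
    print(f"n={n:>6}  full={full:.3f}  sampled={sampled:.3f}")
```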
“…There are multiple choices of loss function for training a recommendation model, including pointwise loss (e.g., BCE [12, 26], MSE [9, 17]), pairwise loss (e.g., BPR [25]), and Softmax loss [34]. Recent work [34] finds that Softmax loss can mitigate popularity bias, achieve great training stability, and align well with the ranking metric. It usually achieves better performance than the others and thus attracts a surge of interest in recommendation.…”
Section: Preliminaries
confidence: 99%
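For reference, the three loss families the quote lists can be sketched side by side. This is an illustrative summary, not the cited implementations: `pos_score` is a `(B,)` tensor of positive-item scores, the pointwise and pairwise variants take one negative score per example, and the softmax variant takes a `(B, n)` block of negative scores.

```python
import torch
import torch.nn.functional as F

def bce_loss(pos_score, neg_score):
    """Pointwise: classify each interaction independently (cf. BCE)."""
    scores = torch.cat([pos_score, neg_score])
    labels = torch.cat([torch.ones_like(pos_score),
                        torch.zeros_like(neg_score)])
    return F.binary_cross_entropy_with_logits(scores, labels)

def bpr_loss(pos_score, neg_score):
    """Pairwise: push each positive above one sampled negative (cf. BPR)."""
    return -F.logsigmoid(pos_score - neg_score).mean()

def softmax_loss(pos_score, neg_scores):
    """Softmax over the positive plus n negatives; the positive is class 0."""
    logits = torch.cat([pos_score.unsqueeze(-1), neg_scores], dim=-1)
    labels = torch.zeros(logits.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)
```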
“…1 a). Here we follow [34] and split items into ten groups in terms of item popularity. A larger group ID indicates that the group contains more popular items.…”
Section: Empirical Analysis
confidence: 99%
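The ten-group split described here is straightforward to reproduce in spirit. The following NumPy sketch (with illustrative names, not the authors' code) counts interactions per item, sorts by popularity, and cuts the sorted list into ten equal-sized groups so that a larger group ID means more popular items.

```python
import numpy as np

def popularity_groups(interaction_item_ids: np.ndarray, num_groups: int = 10):
    """Return {group_id: item ids}; larger group ids hold more popular items.

    `interaction_item_ids` is the flat array of item ids over all
    user-item interactions, so repeated ids encode popularity.
    """
    items, counts = np.unique(interaction_item_ids, return_counts=True)
    order = np.argsort(counts)                       # least -> most popular
    chunks = np.array_split(np.arange(len(items)), num_groups)
    return {g: items[order[idx]] for g, idx in enumerate(chunks)}
```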