2018
DOI: 10.7494/csci.2018.19.3.2753

Building Semantic User Profile for Polish Web News Portal

Abstract: The aim of this research is to construct meaningful user profiles that are the most descriptive of user interests in the context of the media content that they browse. We use two distinct state-of-the-art numerical text-representation techniques: LDA topic modeling and Word2Vec word embeddings. We train our models on the collection of news articles in Polish and compare them with a model built on a general language corpus. We compare the performance of these algorithms on two practical tasks. First, we perform…

Citations: cited by 3 publications (4 citation statements)
References: 19 publications
“…One approach is to use other similarity measures considering the user metadata rather than behavioral patterns. It was observed by Goel et al (2012); Misztal-Radecka (2018) that there are some significant differences in the frequency of interaction with relevant categories of online services among different demographic groups. Though relevant, the demographics information is usually not available for a majority of new users and some other features are required to predict user preferences.…”
Section: User Modeling In Cold-start Situations (mentioning)
confidence: 96%
“…Moreover, there have been a few attempts to apply text embedding methods to represent content-based user profiles. In Musto et al (2016); Alekseev et al (2017); Misztal-Radecka (2018), Word2Vec is used to build content-based user profiles. It was observed that this approach gives comparable results to the standard collaborative filtering techniques, especially for sparse datasets.…”
Section: From Word Vectors To User Embeddings (mentioning)
confidence: 99%
“…The main challenge when modeling topics with LDA is to define the appropriate number of topics (k) to represent the whole corpus and the interpretability of the topics [3] [11]. Therefore, the coherence metric developed by [12] was applied to validate the tuples (k,α).…”
Section: Model Validation (mentioning)
confidence: 99%
“…They applied LDA method to extract topics from news, with LDA they obtained more reliable result than K-means clustering. Moreover, [3] worked in create user profiles based on news articles. This author used LDA and Word2Vec to represent information and demonstrated that Word2Vec works better in comparing small text like titles.…”
Section: Introduction (mentioning)
confidence: 99%