2017
DOI: 10.1007/978-3-319-71746-3_14
Comparison of Vector Space Representations of Documents for the Task of Information Retrieval of Massive Open Online Courses

Cited by 5 publications (5 citation statements) · References 6 publications
“…Although feature selection techniques reduce the dimensionality of textual data to a great extent, traditional machine-learning-based text document classification methodologies still face the sparsity problem in bag-of-words-based feature representation techniques [27], [28]. Bag-of-words-based feature representation techniques consider unigrams, n-grams, or specific patterns as features [27], [28]. These algorithms do not capture the complete contextual information of the data and also face the problem of data sparsity [27], [28].…”
Section: Introduction
confidence: 99%
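The sparsity problem the citing authors describe can be made concrete with a minimal sketch: in a unigram bag-of-words representation, every document vector has one dimension per vocabulary term, so most entries are zero even for a tiny corpus. The function and corpus below are illustrative inventions, not code from the paper.

```python
from collections import Counter

def bow_vectors(docs):
    """Build unigram bag-of-words count vectors over a shared vocabulary.

    Each vector's length equals the corpus vocabulary size, so entries
    for terms absent from a document are zero -- the sparsity problem."""
    vocab = sorted({tok for doc in docs for tok in doc.lower().split()})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0] * len(vocab)
        for tok, n in Counter(doc.lower().split()).items():
            vec[index[tok]] = n
        vectors.append(vec)
    return vocab, vectors

# Toy corpus: note "course" and "courses" count as unrelated features,
# since plain unigrams capture no semantic relation between them.
docs = ["the course covers machine learning",
        "online courses teach programming"]
vocab, vecs = bow_vectors(docs)
sparsity = sum(v == 0 for vec in vecs for v in vec) / (len(vecs) * len(vocab))
```

Even with two five-word sentences, half of all vector entries are zero; with a realistic vocabulary of tens of thousands of terms, the fraction of zeros approaches one, which is exactly the sparsity that motivates dense embeddings.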
“…Bag-of-words-based feature representation techniques consider unigrams, n-grams, or specific patterns as features [27], [28]. These algorithms do not capture the complete contextual information of the data and also face the problem of data sparsity [27], [28]. These problems are solved by word embeddings, which capture not only syntactic but also semantic information of textual data [29].…”
Section: Introduction
confidence: 99%
“…The authors of ARTM have proposed various sets of regularizers that increase the interpretability, sparsity, and variation of the topics produced by the model. In a series of experiments, a level of quality was achieved comparable, if not superior, to common approaches of the word2vec family [13].…”
Section: Overview Of Topic Modelling Approaches
confidence: 87%
“…[set-membership conditions garbled in extraction] Then, the main metrics for evaluating the quality of clustering were calculated (Eqs. 12–15):…”
confidence: 99%
“…As an example, Klenin and Botov [3] presented the results of an evaluation of various vector space models, namely TF-IDF, LSA, LDA, averaged Word2vec, and Paragraph2vec, in the context of educational course program documents. Paragraph2vec, which uses neural networks to learn word and document vectors in a low-dimensional space, provides the best results for both clustering and classification tasks.…”
Section: Discussion
confidence: 99%
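One of the compared representations, averaged Word2vec, builds a document vector by averaging the embeddings of its words; documents can then be compared by cosine similarity. The sketch below illustrates the idea with made-up 3-dimensional vectors (a real setup would load trained Word2vec embeddings of 100+ dimensions); all names and values are illustrative assumptions.

```python
import math

# Hypothetical 3-d word vectors standing in for trained Word2vec embeddings.
WORD_VECS = {
    "machine":  [0.9, 0.1, 0.0],
    "learning": [0.8, 0.2, 0.1],
    "course":   [0.1, 0.9, 0.2],
    "online":   [0.0, 0.8, 0.3],
}

def doc_vector(tokens):
    """Averaged-embedding document vector (tokens without a vector are skipped)."""
    vecs = [WORD_VECS[t] for t in tokens if t in WORD_VECS]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

d1 = doc_vector(["machine", "learning"])
d2 = doc_vector(["online", "course"])
sim = cosine(d1, d2)
```

Unlike the bag-of-words case, the resulting vectors are dense and of fixed low dimension regardless of vocabulary size, which is why embedding-based representations sidestep the sparsity problem discussed above.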