2017
DOI: 10.1134/s1054661817040058

The TF-IDF measure and analysis of links between words within N-grams in the formation of knowledge units for open tests

Cited by 7 publications (4 citation statements)
References 4 publications
“…We applied the term frequency-inverse document frequency (TF-IDF) algorithm (20)(21)(22) and cosine similarity to match publication abstracts and patent documents by assessing text similarity. For each hospital, we built one TF-IDF library for publications and one for patents and then calculated two text similarity matrixes based on each TF-IDF library.…”
Section: Data Collection; citation type: mentioning
confidence: 99%
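
The matching step described in this statement can be sketched as follows. This is a minimal illustration, not the citing authors' implementation: it assumes whitespace tokenization, raw term frequency weighted by log(N/df), and a single shared corpus (the statement describes separate TF-IDF libraries for publications and patents). The function names and the sample texts are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (TF times log(N / df))."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample texts standing in for an abstract and two patents.
docs = [
    "stent coating for coronary artery",
    "coronary artery stent with polymer coating",
    "image segmentation of retinal scans",
]
vectors = tfidf_vectors(docs)
sim_related = cosine(vectors[0], vectors[1])
sim_unrelated = cosine(vectors[0], vectors[2])
```

Under these assumptions, the first two texts share several weighted terms and score higher than the unrelated pair, which shares none.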
“…Once the Document Frequency value of each feature is calculated, appropriate features are selected through the threshold. (Kshirsagar et al, 2020; Mestry et al, 2019; Gao et al, 2019; Kermani et al, 2019; Othman et al, 2019; Sidorov et al, 2019; Maryam et al, 2018; Das et al, 2018; Yamout et al, 2018; White et al, 2018; Alsmadi et al, 2018; Gu et al, 2018; Emelyanov et al, 2017; Chen et al, 2017; Krouska et al, 2016).…”
Section: Features Selection and Extraction; citation type: mentioning
confidence: 99%
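
Document-frequency thresholding, as mentioned in this statement, might look like the sketch below. The statement does not say whether the threshold is a lower or upper bound or what value is used; this example assumes a lower bound (`min_df`, a hypothetical parameter) and whitespace tokenization. The sample documents are invented for illustration.

```python
from collections import Counter


def select_by_document_frequency(docs, min_df=2):
    """Keep terms that occur in at least `min_df` documents (assumed rule)."""
    # Use a set per document so a term counts once per document.
    tokenized = [set(doc.lower().split()) for doc in docs]
    df = Counter(term for terms in tokenized for term in terms)
    return sorted(term for term, count in df.items() if count >= min_df)


# Hypothetical sample documents.
docs = [
    "open test question answer",
    "open test knowledge unit",
    "knowledge unit extraction",
]
selected = select_by_document_frequency(docs, min_df=2)
```

With `min_df=2`, terms that appear in only one document are discarded and the rest are kept as features.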
“…There are three main types of similarity computations [8]. The first type is based on keyword matching, e.g., the N-gram method [9] or the Jaccard method [10]. The second type is based on vector spaces and is the most widely used method at present.…”
Section: Introduction; citation type: mentioning
confidence: 99%
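
The keyword-matching family named in this statement can be illustrated with a short sketch. The statement lists the N-gram method and the Jaccard method as separate examples; combining them here (Jaccard overlap of word bigram sets) is an assumption made for illustration, as are the function names and sample sentences.

```python
def word_ngrams(text, n=2):
    """Return the set of word n-grams in `text` (whitespace tokenization)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical sample sentences differing only in the last word.
left = word_ngrams("the tf-idf measure of words")
right = word_ngrams("the tf-idf measure of links")
score = jaccard(left, right)
```

The two sentences share three of five distinct bigrams, so the score reflects surface overlap only, which is the limitation that motivates the vector-space methods the statement turns to next.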