2010
DOI: 10.1007/s13042-010-0001-0

Understanding bag-of-words model: a statistical framework

Abstract: The bag-of-words model is one of the most popular representation methods for object categorization. The key idea is to quantize each extracted key point into one of the visual words, and then represent each image by a histogram of the visual words. For this purpose, a clustering algorithm (e.g., K-means) is generally used to generate the visual words. Although a number of studies have shown encouraging results of the bag-of-words representation for object categorization, theoretical studies on properties of the…
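The quantization step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`kmeans`, `bow_histogram`), the deterministic initialization from the first k points, and the toy 2-D "descriptors" are all assumptions made for illustration. Real key point descriptors would be much higher-dimensional.

```python
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point: Vector, centroids: List[List[float]]) -> int:
    """Index of the centroid (visual word) closest to the point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points: List[Vector], k: int, iters: int = 20) -> List[List[float]]:
    """Plain Lloyd's K-means. Assumption: the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids


def bow_histogram(descriptors: List[Vector], centroids: List[List[float]]) -> List[int]:
    """Quantize each descriptor to its nearest visual word and count occurrences."""
    hist = [0] * len(centroids)
    for d in descriptors:
        hist[nearest(d, centroids)] += 1
    return hist


# Toy training descriptors forming two well-separated groups.
training = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
codebook = kmeans(training, k=2)

# Descriptors from one hypothetical image.
image_descriptors = [(0.2, 0.1), (9.5, 10.2), (10.1, 9.9)]
histogram = bow_histogram(image_descriptors, codebook)
```

The histogram has one bin per visual word, so images with different numbers of key points map to vectors of the same length.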

Citations: cited by 1,091 publications (457 citation statements)
References: 17 publications
“…3, a bag-of-words feature model is used to represent each unstructured feature extracted from the ticket. A bag-of-words representation is known to extract good patterns from unstructured text data [80]. The bag-of-words model can be learnt over a vector of unigrams or bigrams or both extracted from text data.…”
Section: Feature Extraction (mentioning)
confidence: 99%
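As a rough illustration of a bag-of-words over unigrams, bigrams, or both, the following sketch counts n-grams in a piece of text. The tokenizer, the function name `ngram_bag`, and the sample ticket text are assumptions made for this example and are not taken from the citing paper.

```python
import re
from collections import Counter
from typing import Iterable, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngram_bag(text: str, n_values: Iterable[int] = (1, 2)) -> Counter:
    """Count every n-gram of the requested sizes; word order beyond n is discarded."""
    tokens = tokenize(text)
    bag: Counter = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            bag[" ".join(tokens[i:i + n])] += 1
    return bag


# Hypothetical ticket text, used only to exercise the function.
bag = ngram_bag("Printer is offline, printer needs reset")
unigrams_only = ngram_bag("Printer is offline, printer needs reset", n_values=(1,))
```

Passing `n_values=(1,)` or `n_values=(2,)` restricts the bag to unigrams or bigrams alone.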
“…There are many different techniques for feature extraction, however, BoW models [6] and distributed representation [7] is the most popular methods used in NLP. TF-IDF [8] is one of the widely used method of BoW models, it's simplistic but surprisingly useful in practice.…”
Section: Feature Vector Representation Of News (mentioning)
confidence: 99%
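TF-IDF weighting on top of bag-of-words counts can be sketched as follows. TF-IDF has several variants; this example assumes term frequency is the count divided by document length and inverse document frequency is log(N / df), which may differ from the formulation the citing paper uses. The function name `tfidf` and the toy documents are illustrative.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df: Counter = Counter()
    for doc in docs:
        df.update(set(doc))

    weights = []
    for doc in docs:
        counts = Counter(doc)
        length = len(doc)
        weights.append({
            term: (count / length) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus of two already-tokenized documents.
corpus = [["market", "news"], ["market", "sports"]]
scores = tfidf(corpus)
```

Under this variant a term that appears in every document, such as "market" here, gets weight zero, while terms unique to one document get a positive weight.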
“…Code Fragment: A continuous segment of source code, specified by the triple (l, s, e), including the source file l, the line the fragment starts on, s, and the line it ends on, e. [Footnote 1: Similar to the popular bag-of-words model [39] in Information Retrieval.] Clone Pair: A pair of code fragments that are similar, specified by the triple (f1, f2, φ), including the similar code fragments f1 and f2, and their clone type φ.…”
Section: Definitions (mentioning)
confidence: 99%