2007
DOI: 10.1145/1236471.1236473

Efficient sampling of training set in large and noisy multimedia data

Abstract: As the amount of multimedia data is increasing day-by-day thanks to less expensive storage devices and increasing numbers of information sources, machine learning algorithms are faced with large-sized and noisy datasets. Fortunately, the use of a good sampling set for training influences the final results significantly. But using a simple random sample (SRS) may not obtain satisfactory results because such a sample may not adequately represent the large and noisy dataset due to its blind approach in selecting …

Cited by 10 publications (5 citation statements)
References 37 publications
“…There are various solutions to deal with the noisy data. For instance, to employ some noise filter mechanisms to smooth the noisy data or to apply an appropriate sampling technique to differentiate the noisy data from the input data before the fusion takes place [137].…”
Section: The Work (mentioning)
confidence: 99%
“…In (Tseng et al, 2008), collaborative filtering is applied to the intelligent multimedia recommender based on the assumption that people can be grouped to share their valuable purchase information with each other to make the choices about the preferred items, such as movies and books. In recent years, several categories of data filtering methods have been investigated for multimedia retrieval such as algorithms using posterior probability (Angelova, Abu-Mostafa & Perona, 2005;Vezhnevets & Barinova, 2007), techniques based on distance or density measure (Chen et al, 2007;Wang, Dash, Chia, & Xu, 2007), and clustering based data filtering methods (Sarawagi, Deshpande & Kasliwal, 2009;Xiong, Gaurav, Steinbach & Vipin, 2006).…”
Section: Introduction (mentioning)
confidence: 99%
“…Their method is based on the Nearest Neighbor classifier. Wang et al [5], present a method to sample a large and noisy multimedia data. Their method is based on a simple distance measure that compares the histograms of the sample set and the whole set in order to assess the representativeness of the sample set.…”
Section: Related Work (mentioning)
confidence: 99%
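
The last citation statement describes the idea of scoring a sample by comparing its histogram with the histogram of the whole set. The following is a minimal sketch of that idea, not the authors' algorithm: the L1 distance, the equal-width binning, the one-dimensional feature values, and the "draw several random samples and keep the closest" strategy are all assumptions made for illustration, and every function name here is hypothetical.

```python
import random


def histogram(values, bins, lo, hi):
    """Normalized histogram of `values` over equal-width bins on [lo, hi]."""
    counts = [0] * bins
    if hi == lo:
        # Degenerate range: all values are identical, so use a single bin.
        counts[0] = len(values)
    else:
        width = (hi - lo) / bins
        for v in values:
            # Clamp so that v == hi falls into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def histogram_distance(sample, population, bins=10):
    """L1 distance between the two normalized histograms (assumed measure).

    0.0 means identical histograms; 2.0 is the maximum possible value.
    """
    lo, hi = min(population), max(population)
    hs = histogram(sample, bins, lo, hi)
    hp = histogram(population, bins, lo, hi)
    return sum(abs(a - b) for a, b in zip(hs, hp))


def best_of_k_samples(population, size, k=20, bins=10, seed=0):
    """Draw k simple random samples and keep the most representative one."""
    rng = random.Random(seed)
    best, best_d = None, float("inf")
    for _ in range(k):
        candidate = rng.sample(population, size)
        d = histogram_distance(candidate, population, bins)
        if d < best_d:
            best, best_d = candidate, d
    return best, best_d


if __name__ == "__main__":
    data = [i % 100 for i in range(1000)]  # toy 1-D feature values
    sample, dist = best_of_k_samples(data, size=50)
    print(len(sample), round(dist, 3))
```

A single simple random sample corresponds to `k=1`. Larger `k` can only keep or lower the reported distance, which is one plain way to see why a histogram check can improve on a blind draw.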