2014
DOI: 10.1007/978-3-319-10590-1_36
Learning Discriminative and Shareable Features for Scene Classification

Cited by 73 publications (53 citation statements)
References 36 publications
“…For the Places365-standard dataset, we use a step size of 150,000 to decrease the learning rate, and the whole training process stops at 600,000 iterations. To speed up training, we use the multi-GPU extension [52] of the Caffe [53] toolbox for our CNN training. For testing our models, we use the common 5 crops (4 corners and 1 center) and their horizontal flips for each image at a single scale, giving 10 crops in total per image.…”
Section: A. Large-scale Datasets and Implementation Details
Confidence: 99%
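The 10-crop test-time augmentation described in this excerpt (4 corners + 1 center, each with its horizontal flip) can be sketched as follows; this is a minimal illustration assuming NumPy images in height-width-channel layout, not the authors' actual pipeline, and the function name `ten_crop` is hypothetical:

```python
import numpy as np

def ten_crop(image: np.ndarray, crop: int) -> list:
    """Return the common 10-crop set: 4 corners and the center crop,
    plus the horizontal flip of each (5 + 5 = 10 crops)."""
    h, w = image.shape[:2]
    # Top-left coordinates of the 4 corner crops and the center crop.
    tops = [0, 0, h - crop, h - crop, (h - crop) // 2]
    lefts = [0, w - crop, 0, w - crop, (w - crop) // 2]
    crops = [image[t:t + crop, l:l + crop] for t, l in zip(tops, lefts)]
    # Append the horizontal flip of each crop.
    crops += [c[:, ::-1] for c in crops]
    return crops
```

At test time, the class scores predicted for the 10 crops are typically averaged to produce the final prediction for the image.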
“…Scene recognition is a fundamental problem in computer vision, and has received increasing attention in the past few years [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11]. Scene recognition not only provides rich semantic information about global structure [12], but also yields meaningful context that facilitates other related vision tasks, such as object detection [13], [14], [15], event recognition [16], [17], [18], and action recognition [19], [20], [21].…”
Confidence: 99%
“…We have re-labeled those images and will make the labels publicly available. As the MIT Indoor67 dataset is widely used in performance evaluation, e.g., [11], [4], [5], [6], [18], [19], [20], [21], [16], [22], [23], [24], [25], [26], [14], [27], the finding of a 4% annotation error will likely prompt a revision of the results obtained so far on this dataset. We show that the proposed model outperforms the top-performing approach [11] in robustness to occlusion and aspect change.…”
Section: A. Our Approach and Contributions
Confidence: 99%
“…In [11], for example, the authors achieve state-of-the-art place categorization performance by extracting CNN features from images. Several researchers, e.g., [22], [4], proposed concatenating the CNN features with additional features to improve the final descriptor. The authors of [19] propose extracting visual words from CNN patches sampled at multiple scales.…”
Section: Related Work
Confidence: 99%
“…However, shadows can also cause complications in image processing and computer vision. They can degrade the performance of object recognition [1], image feature extraction [2], scene analysis [3], and face recognition [4]. It is easy for the human eye to distinguish shadows from objects, but identifying shadows automatically is a challenging research problem.…”
Section: Introduction
Confidence: 99%