Fine-Grained Visual Classification via Progressive Multi-granularity Training of Jigsaw Patches

Du, Ruoyi; Chang, Dongliang; Bhunia, Ayan Kumar; Xie, Jiyang; Ma, Zhanyu; Song, Yi-Zhe; Guo, Jun

doi:10.1007/978-3-030-58565-5_10

Cited by 286 publications

(150 citation statements)

References 34 publications

Supporting

Mentioning: 150

Contrasting


“…2) Comparison of different fine-grained algorithms. We select ResNet50 [25], BCNN [4], RA-CNN [5], MA-CNN [26], WS-DAN [8], and PMG [27] to compare with the proposed method, and test them on three public fine-grained data sets. The experimental results are shown in Tables 5 and 6.…”

Section: Methods (mentioning)

confidence: 99%

Fine-grained classification based on multi-scale pyramid convolution networks

et al. 2021


Large intra-class variance and small inter-class variance are the key factors affecting fine-grained image classification. Recent algorithms have become more accurate and efficient. However, these methods ignore the multi-scale information of the network, resulting in insufficient ability to capture subtle changes. To solve this problem, a weakly supervised fine-grained classification network based on a multi-scale pyramid is proposed in this paper. It replaces the ordinary convolution kernels in the residual network with pyramid convolution kernels, which expands the receptive field of the convolution kernel and exploits complementary information at different scales. Meanwhile, the weakly supervised data augmentation network (WS-DAN) is used to prevent overfitting and improve the performance of the model. In addition, a new attention module, which includes spatial attention and channel attention, is introduced to pay more attention to the object part in the image. Comprehensive experiments are carried out on three public benchmarks. They show that the proposed method can extract subtle features and classify effectively.


“…They report state-of-the-art results in BoxCars and competent results in CompCars. In [23], Du et al. proposed a novel method that adds new layers in each training step, exploiting information from the previous step, and uses a jigsaw puzzle generator to enhance the network input by forming images that contain information from different granularity levels. They report results on several fine-grained classification datasets, obtaining state-of-the-art results on Cars-196.…”

Section: B Fine-grained Vehicle Classification (mentioning)

confidence: 99%
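The jigsaw puzzle generator mentioned in the citation statement above can be illustrated with a minimal sketch. It assumes the image is a plain 2-D list of pixel values, that the patches are permuted uniformly at random, and that the image dimensions are divisible by the grid size. The function name `jigsaw_shuffle` and its parameters are hypothetical, and this is not the authors' implementation.

```python
import random


def jigsaw_shuffle(image, n, seed=None):
    """Split a 2-D image (list of rows) into an n x n grid and shuffle the patches.

    Hypothetical helper for illustration only; `seed` makes the shuffle repeatable.
    """
    height, width = len(image), len(image[0])
    if height % n or width % n:
        raise ValueError("image height and width must be divisible by n")
    patch_h, patch_w = height // n, width // n

    # Cut the image into n*n patches, collected in row-major order.
    patches = [
        [row[c * patch_w:(c + 1) * patch_w]
         for row in image[r * patch_h:(r + 1) * patch_h]]
        for r in range(n)
        for c in range(n)
    ]

    # Permute the patches; pixel content inside each patch is untouched.
    random.Random(seed).shuffle(patches)

    # Stitch the shuffled patches back into an image of the original size.
    output = []
    for r in range(n):
        band = patches[r * n:(r + 1) * n]
        for y in range(patch_h):
            output.append([pixel for patch in band for pixel in patch[y]])
    return output


if __name__ == "__main__":
    demo = [[r * 4 + c for c in range(4)] for r in range(4)]
    for row in jigsaw_shuffle(demo, 2, seed=0):
        print(row)
```

A larger `n` yields smaller patches and therefore finer-grained local information, while `n = 1` returns the image unchanged. How grid sizes are matched to training steps is not stated in the statement above, so any schedule used with this sketch is an assumption.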

Are We Ready for Accurate and Unbiased Fine-Grained Vehicle Classification in Realistic Environments?

et al. 2021


Fine-grained vehicle classification from images, also known as Vehicle Make and Model Recognition (VMMR), has become an important research topic in recent years, with a growing number of scientific contributions in multiple application areas, such as autonomous vehicles, surveillance systems, traffic monitoring and management, among others. Recent techniques based on deep learning have proven to be very effective in addressing this problem. So effective that, based on the state-of-the-art results (above 95% accuracy), it would seem that the problem is practically solved. However, our main hypothesis is that the existing datasets to date have limited variability, which precludes good and unbiased generalisation of the models trained with them. In particular, it is observed that the test datasets are very similar in nature to those used for training and validation, which makes these benchmarks prone to dataset bias and to overfitting. When these systems are tested with more challenging data or data from different datasets, performance degrades considerably. In this paper, on the one hand, we evaluate state-of-the-art deep learning models to perform fine-grained vehicle classification and explore multiple training techniques, such as curriculum learning or weighted losses, to mitigate the bias between different makes and models and to assess the limits of current approaches. On the other hand, we analyse the existing datasets, present an additional dataset from a challenging scenario, and merge all the data into a cross-dataset that includes common samples and classes from the existing datasets. In this way, we can evaluate geographical, make and model biases, and performance and generalisation capabilities from a more realistic perspective. The obtained results suggest that we are still far from accurate and unbiased vehicle make and model recognition in realistic traffic and driving scenarios.

Index terms: Fine-grained classification, vehicle make and model, dataset bias, curriculum learning, weighted loss, cross-datasets.


“…Du et al. [21] approached the problem of fine-grained visual classification from a rather unconventional perspective: they neither explicitly nor implicitly mine for object parts; instead, they show that fine-grained features can be extracted by learning across granularities and effectively fusing multi-granularity features. The method can be trained end-to-end without additional manual annotations other than category labels, and only needs one network with one feedforward pass during testing.…”

Section: Related Work (mentioning)

confidence: 99%

Deep Learning-Based Object Detection Improvement for Fine-Grained Birds

Yang, Song 2021, IEEE Access


When object detection algorithms are applied to bird protection projects, they face problems such as large numbers of model parameters, high similarity between bird species, and a single sample scene. To further improve the detection accuracy and stability of the object detection model, a multi-object detection algorithm for fine-grained birds is proposed. First, the algorithm introduces depthwise separable convolution into the feature extraction layers of YOLOv3. The convolution process is divided into two parts, depthwise convolution and pointwise convolution, which separates intra-channel convolution from inter-channel convolution. While high detection accuracy is maintained, the number of model parameters and the amount of computation are greatly reduced. Finally, focal loss is added to the loss function to address the severe imbalance between positive and negative samples. By reducing the weight of the many easy background samples, the algorithm focuses more on detecting foreground classes. The experimental results show that, on the bird data set, the mean average precision (mAP) of this algorithm is 2.71% higher than that of YOLOv3, the number of parameters is 79.88% lower than that of the basic YOLOv3 model, and the number of frames per second (FPS) is 19.98% higher than that of YOLOv3. The algorithm thus greatly reduces the number of model parameters and the computation while also improving detection speed and mAP.
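The two techniques named in this abstract can be made concrete with a short sketch: a parameter count comparing a standard convolution with a depthwise separable one (bias terms ignored), and the usual binary form of focal loss. The function names are hypothetical, the defaults `gamma=2.0` and `alpha=0.25` are assumptions commonly used with focal loss rather than values taken from the abstract, and the layer sizes in the usage example are illustrative.

```python
import math


def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution (bias ignored).

    Depthwise step: one kernel x kernel filter per input channel.
    Pointwise step: a 1 x 1 convolution that mixes channels.
    """
    return kernel * kernel * c_in + c_in * c_out


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for one prediction p = P(y = 1) and a label y in {0, 1}.

    gamma down-weights easy examples; alpha balances the two classes.
    Both defaults are assumptions, not values from the abstract.
    """
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # Illustrative layer: 3 x 3 kernel, 256 input and 512 output channels.
    print(standard_conv_params(3, 256, 512), separable_conv_params(3, 256, 512))
    # An easy background sample contributes far less than a hard one.
    print(binary_focal_loss(0.05, 0), binary_focal_loss(0.95, 0))
```

With `gamma=0` and `alpha=0.5` the loss reduces to half the ordinary cross-entropy, which is a convenient sanity check when experimenting with the sketch.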
